Author manuscript; available in PMC: 2022 Apr 11.
Published in final edited form as: J Am Med Dir Assoc. 2021 Aug;22(8):1606–1608. doi: 10.1016/j.jamda.2021.06.020

What Clinicians Need to Know About Measurement

Sheryl Zimmerman 1,*
PMCID: PMC8996809  NIHMSID: NIHMS1792042  PMID: 34334161

The August issue of JAMDA features numerous articles that advance measurement in post-acute and long-term care. Measurement is of critical importance in clinical care: it matters for screening, assessing, predicting, treating, dosing, and monitoring. However, few clinicians understand critical issues underlying measurement itself, despite the fact that such understanding is important to digest research findings and conclusions and their application to practice. This editorial presents 5 of those issues: measurement in relation to causality, relevance, adequacy, psychometrics, and inference.

Just Because 2 Things Are Associated Statistically Does Not Mean That One Causes the Other

The logic behind “association does not equal causality” is largely predicated on the fact that a third variable may be responsible for any given association.1 For example, older adults with less education report fewer benefits from rehabilitation than do those with a college degree2; surely, the physiologic benefits of rehabilitation are not based on an individual’s education. However, in some cases association may suggest causality, and although a third variable may be the operative cause, it may be embedded within the phenomenon. For example, COVID-19 incidence and mortality rates were lower in Green House and small nursing homes than in other nursing homes, which may in part be caused by their smaller size, which limits the number of infectious people who enter the building.3

Although the logic of these examples is likely clear, numerous published articles assert causality based simply on association, which may mislead a casual reader. [“Casual” is not meant to disparage readers; instead, it recognizes that many readers (1) focus on the abstract, which often omits important details, and (2) do not actually have access to entire articles so as to read those important details.]

  • Bottom line: Readers must be cautious when titles assert “outcomes,” “influence,” or “impact” absent a randomized trial4,5 or based on a small sample size.6,7
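The third-variable logic described above can be made concrete with a small worked example. The sketch below uses invented counts (they are not data from any cited study) and a hypothetical third variable, baseline function, that is related both to education and to reporting benefit. Within each level of the third variable the two education groups report benefit at the same rate, yet the crude comparison shows a sizable gap. The names `COUNTS`, `crude_rate`, and `stratum_rate` are illustrative assumptions.

```python
# Hypothetical counts invented for illustration; not data from any cited study.
# Key: (third-variable stratum, education group)
# Value: (number of people, number reporting benefit)
COUNTS = {
    ("high_baseline_function", "college"): (80, 64),
    ("high_baseline_function", "no_college"): (20, 16),
    ("low_baseline_function", "college"): (20, 8),
    ("low_baseline_function", "no_college"): (80, 32),
}


def crude_rate(group):
    """Benefit rate for an education group, ignoring the third variable."""
    people = sum(n for (_, g), (n, _k) in COUNTS.items() if g == group)
    benefit = sum(k for (_, g), (_n, k) in COUNTS.items() if g == group)
    return benefit / people


def stratum_rate(stratum, group):
    """Benefit rate for an education group within one stratum."""
    people, benefit = COUNTS[(stratum, group)]
    return benefit / people


if __name__ == "__main__":
    for group in ("college", "no_college"):
        print(group, "crude rate:", round(crude_rate(group), 2))
    for stratum in ("high_baseline_function", "low_baseline_function"):
        for group in ("college", "no_college"):
            print(stratum, group, round(stratum_rate(stratum, group), 2))
```

In these made-up numbers the crude rates differ (72 of 100 versus 48 of 100) only because the college group contains more people with high baseline function; within each stratum the rates are identical. Real data are rarely this tidy, and stratifying on one variable does not rule out others.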

Measurement Does Not Ensure Relevance

The use of data from administrative and clinical records to assess care and outcomes has become common and will continue to grow given the utility of large samples.8 Consider, for example, how much we learned about COVID-19 in long-term care thanks to administrative data.9–11 The National Institute on Aging (NIA) is especially invested in the promise of data existing in health care systems, and funded the IMbedded Pragmatic Alzheimer’s disease (AD) and AD-Related Dementias (AD/ADRD) Clinical Trials (IMPACT) Collaboratory to build capacity and promote pragmatic clinical trials in these systems.12

IMPACT recognizes, however, that not all data contained in these databases are relevant to all stakeholders, and conversely, that some data relevant to stakeholders are not contained in these databases. In fact, outcome measures reflecting the lived experience of persons with AD/ADRD and their caregivers are rarely available in large clinical and administrative data sources.13 One such limitation might be attenuated based on the findings of an article published in the August issue of JAMDA, which recommended that Nursing Home Compare be improved by adding consumer assessments of quality.14

  • Bottom line: Readers should consider whether the care processes and outcomes being examined are important to the end user; what is easily measured should not dictate what is relevant.

Existing Measures Are Not Always Adequate

In addition to improving existing measures, there is a need to develop new measures. As a case in point, shortly after the Alzheimer’s Association set forth the 2018 Dementia Care Practice Recommendations that address care in 9 domains ranging from detection and diagnosis, to care for activities of daily living (ADL), to transitions and coordination of services,15 it became apparent that existing measures do not sufficiently assess care and outcomes in each of these domains. Consider, for example, that a measure of well-being following a new diagnosis of Alzheimer’s disease may be very different from a measure of well-being related to ADL care or following a transition in care.16 The NIA funded LINC-AD (Leveraging an Interdisciplinary Consortium to Improve Care and Outcomes for Persons Living with Alzheimer’s and Dementia) expressly to promote new measurement development in these areas.

  • Bottom line: Readers should look carefully at whether findings and conclusions regarding a given construct are based on sufficient measurement (ie, use a valid measurement tool).

Not All Research Uses Psychometrically Sound Measures

Simply stated, a measure that is psychometrically sound measures what it intends to measure. Among other criteria, psychometrically sound measures are valid; they (1) appear suitable to their intent (face validity), (2) include all aspects of the concept being measured (content validity), (3) test the concept that is intended to be measured (construct validity), (4) yield results similar to those measured by instruments of the same or related concepts (criterion validity), (5) yield results that differentiate groups that are expected to differ (discriminant validity), and (6) predict a future expected state (predictive validity).

Psychometrically sound measures also are reliable; they generate the same response (1) when used over time (test-retest reliability) and (2) when used by different raters (inter-rater reliability), and (3) the items within a given measure are scored similarly (internal consistency). In addition, measures must be appropriate for a given population: if they exhibit floor or ceiling effects, meaning most people score near the bottom or the top, they are not useful to indicate decline (if people already score at the floor) or improvement (if people already score at the ceiling). Also, some measures may not be sensitive enough to detect clinically meaningful change. If a measure is not reliable or valid, exhibits floor or ceiling effects, or is not sensitive to change, results are suspect.
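Two of the properties just described lend themselves to a brief worked sketch: internal consistency, here summarized with Cronbach's alpha (one common statistic for it, chosen for this illustration rather than named in the editorial), and floor or ceiling effects, summarized as the share of respondents at the lowest or highest possible total score. The 3-item scale scored 0 to 4 and the six respondents are invented for illustration, and the function names are assumptions.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent).

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(rows[0])                      # number of items
    item_columns = list(zip(*rows))       # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def share_at_extremes(rows, min_total, max_total):
    """Return (share at floor, share at ceiling) for the total scores."""
    totals = [sum(row) for row in rows]
    at_floor = sum(t == min_total for t in totals) / len(totals)
    at_ceiling = sum(t == max_total for t in totals) / len(totals)
    return at_floor, at_ceiling


# Hypothetical responses: 6 people, 3 items, each item scored 0 to 4.
RESPONSES = [
    [4, 4, 4],
    [4, 4, 4],
    [4, 4, 4],
    [3, 4, 3],
    [2, 2, 3],
    [1, 1, 0],
]

if __name__ == "__main__":
    print("alpha:", round(cronbach_alpha(RESPONSES), 2))
    print("floor, ceiling:", share_at_extremes(RESPONSES, 0, 12))
```

In this invented sample the items agree closely, so alpha is high, yet half of the respondents already sit at the maximum total of 12. A measure can therefore look internally consistent and still be unable to show improvement in the population at hand.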

Nine of the articles published in the August issue of JAMDA focus on measurement evaluation, related to measures of arousal,17 frailty,18 function,19,20 and prediction.21–25 However, although many JAMDA authors are expert in measurement, few clinicians are.

  • Bottom line: Readers should review the methods section of an article to learn whether the measures have been used extensively in other research (suggesting they are sound), and whether the authors report psychometric properties.

Measurement Itself May Be Misleading

Clinical treatments may have side effects; so too can the process of measurement. The “Hawthorne effect” is a phenomenon with which many are familiar; it refers to behavior change due to the awareness of being observed, typically in the direction of improvement. However, few appreciate that measurement can have a contrary effect: for example, a falls reduction quality improvement project may result in more documented falls after project initiation than at baseline merely because the staff become more diligent about recording falls. In such a case, treatment benefits may be underestimated.
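The falls example can be worked through with simple arithmetic. The sketch below assumes that documented falls equal true falls multiplied by the fraction of falls that staff record; that model, and every number in it, is a hypothetical simplification for illustration and is not drawn from any project or study.

```python
def documented_falls(true_falls, recording_rate):
    """Falls that appear in the record, given the fraction actually recorded."""
    return true_falls * recording_rate


def percent_change(before, after):
    """Percent change from the baseline value to the follow-up value."""
    return 100 * (after - before) / before


# Hypothetical values: the project lowers true falls from 50 to 40,
# while the share of falls recorded rises from 60% to 95%.
TRUE_BEFORE, TRUE_AFTER = 50, 40
RECORDED_BEFORE, RECORDED_AFTER = 0.60, 0.95

if __name__ == "__main__":
    doc_before = documented_falls(TRUE_BEFORE, RECORDED_BEFORE)
    doc_after = documented_falls(TRUE_AFTER, RECORDED_AFTER)
    print("true change (%):", round(percent_change(TRUE_BEFORE, TRUE_AFTER), 1))
    print("documented change (%):", round(percent_change(doc_before, doc_after), 1))
```

Under these assumptions true falls drop by 20%, but documented falls rise from 30 to 38, an apparent increase of roughly 27%. Better recording alone has reversed the direction of the measured result.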

A second misleading characteristic is that some measures include items reflecting both care and outcomes, in which case it becomes challenging to disentangle whether the ultimate intent of care was achieved [eg, the Quality of Dying in Long-Term Care (QOD-LTC) scale includes items such as “resident received compassionate physical touch daily” (care) and “resident appeared to be at peace” (outcome)].26,27

  • Bottom line: Readers should reflect on whether human nature may affect measurement, and what items compose a scale.

The Bottom Bottom Line

  • As end users of research, clinicians should be educated consumers regarding measurement, and not draw conclusions without considering how measurement may have affected an article’s findings and relevance.

Acknowledgments

This work was supported by the National Institute on Aging (NIA) of the National Institutes of Health under award U54AG063546 [which funds the NIA Imbedded Pragmatic Alzheimer’s Disease (AD) and AD-Related Dementias Clinical Trials (IMPACT) Collaboratory] and also NIA Award R24AG065185 (which funds LINC-AD: Leveraging an Interdisciplinary Consortium to Improve Care and Outcomes for Persons Living with Alzheimer’s and Dementia). The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health.

References

  • 1. Rohrer JM. Thinking clearly about correlations and causation: Graphical causal models for observational data. Adv Methods Pract Psychol Sci 2018;1:27–42.
  • 2. Simning A, Caprio TV, Seplaki CL, et al. Patient-reported outcomes in functioning following nursing home or inpatient rehabilitation. J Am Med Dir Assoc 2018;19:864–870.
  • 3. Zimmerman S, Dumond-Stryker C, Tandan M, et al. Nontraditional small house nursing homes have fewer COVID-19 cases and deaths. J Am Med Dir Assoc 2021;22:489–493.
  • 4. Isaac V, Kuot A, Hamiduzzaman M, et al. The outcomes of a person-centered, non-pharmacological intervention in reducing agitation in residents with dementia in Australian rural nursing homes. BMC Geriatr 2021;21:193.
  • 5. Johansson M, Athilingam P. Structured telephone support intervention: Improved heart failure outcomes. JMIR Aging 2020;3:e13513.
  • 6. Dahms R, Eicher C, Haesner M, Mueller-Werdan U. Influence of music therapy and music-based interventions on dementia: A pilot study. J Music Ther; 2021. [Epub ahead of print].
  • 7. Watts KJ, O’Connor M, Johnson CE, et al. Mindfulness-based compassion training for health professionals providing end-of-life care: Impact, feasibility, and acceptability. J Palliat Med; 2021. [Epub ahead of print].
  • 8. Sloane PD, Mor V, Preisser JS. Administrative data for research: An increasingly powerful tool, but still with caveats. J Am Med Dir Assoc 2018;19:97–99.
  • 9. He M, Li Y, Fang F. Is there a link between nursing home reported quality and COVID-19 cases? Evidence from California skilled nursing facilities. J Am Med Dir Assoc 2020;21:905–908.
  • 10. Cai S, Yan D, Intrator O. COVID-19 cases and death in nursing homes: The role of racial and ethnic composition of facilities and their communities. J Am Med Dir Assoc 2021;22:1345–1351.
  • 11. Thomas KS, Zhang W, Dosa DM, et al. Estimation of excess mortality rates among US assisted living residents during the COVID-19 pandemic. JAMA Netw Open 2021;4:e2113411.
  • 12. NIA IMPACT Collaboratory. Transforming Dementia Care. Available at: https://impactcollaboratory.org/overview/. Accessed June 19, 2021.
  • 13. Hanson LC, Bennett AV, Jonsson M, et al. Selecting outcomes to ensure pragmatic trials are relevant to people living with dementia. J Am Geriatr Soc 2020;68(suppl 2):S55–S61.
  • 14. Mukamel DB, Saliba D, Weimer DL, Ladd H. Families’ and residents’ perspectives of the quality of nursing home care: Implications for composite quality measures. J Am Med Dir Assoc 2021;22:1609–1614.
  • 15. Fazio S, Pace D, Maslow K, et al. Alzheimer’s Association Dementia Care Practice Recommendations. Gerontologist 2018;58(suppl 1):S1–S9.
  • 16. Gaugler JE, Bain LJ, Mitchell L, et al. Reconsidering frameworks of Alzheimer’s dementia when assessing psychosocial outcomes. Alzheimers Dement (N Y) 2019;5:388–397.
  • 17. Martella LA, Carmisciano L, Giannotti C, et al. Cross-cultural adaptation and validation of the Italian version of the Observational Scale of Level of Arousal. J Am Med Dir Assoc 2021;22:1615–1620.
  • 18. Nozaki K, Kamiya K, Hamazaki N, et al. Validity and utility of the questionnaire-based Frail Scale in older patients with heart failure: Findings from the FRAGILE-HF. J Am Med Dir Assoc 2021;22:1621–1626.
  • 19. Smit EB, Bouwstra H, Roorda LD, et al. A patient-reported outcomes measurement information system short form for measuring physical function during geriatric rehabilitation: Test-retest reliability, construct validity, responsiveness, and interpretability. J Am Med Dir Assoc 2021;22:1627–1632.
  • 20. Johnson JK, Hohman J, Stilphen M, et al. Functional recovery rate: A feasible method for evaluating and comparing rehabilitation outcomes between skilled nursing facilities. J Am Med Dir Assoc 2021;22:1633–1639.
  • 21. Chong E, Huang Y, Chan M, et al. Concurrent and predictive validity of FRAIL-NH in hospitalized older persons: An exploratory study. J Am Med Dir Assoc 2021;22:1664–1669.
  • 22. Choo PL, Tou NX, Pang BWJ, et al. Timed Up and Go (TUG) reference values and predictive cutoffs for fall risk and disability in Singaporean community-dwelling adults: Yishun Cross-Sectional Study and Singapore Longitudinal Aging Study. J Am Med Dir Assoc 2021;22:1640–1645.
  • 23. Welch SA, Ward RE, Beauchamp MK, et al. The Short Physical Performance Battery (SPPB): A quick and useful tool for fall risk stratification among older primary care patients. J Am Med Dir Assoc 2021;22:1646–1651.
  • 24. Meyer ML, Fustinoni S, Henchoz Y, et al. Slowness predicts mortality: A comparative analysis of walking speed and Moberg Picking-Up tests. J Am Med Dir Assoc 2021;22:1652–1657.
  • 25. Madrigal C, Halladay CW, McConeghy K, et al. Derivation and validation of a predictive algorithm for long-term care admission or death. J Am Med Dir Assoc 2021;22:1658–1663.
  • 26. Munn JC, Zimmerman S, Hanson LC, et al. Measuring the quality of dying in long-term care. J Am Geriatr Soc 2007;55:1371–1379.
  • 27. van Soest-Poortvliet MC, van der Steen JT, Zimmerman S, et al. Measuring the quality of dying and quality of care when dying in long-term care settings: A qualitative content analysis of available instruments. J Pain Symptom Manage 2011;42:852–863.
