Author manuscript; available in PMC: 2023 Mar 1.
Published in final edited form as: J Geriatr Psychiatry Neurol. 2022 Jun 2;36(2):164–170. doi: 10.1177/08919887221106446

Identifying quality indicators for nursing home residents with dementia: A modified Delphi method

Jennifer G Burgess 1, Donovan T Maust 1,2, Ming-Un Myron Chang 1, Kara Zivin 1,2, Lauren B Gerlach 2
PMCID: PMC9715812  NIHMSID: NIHMS1842213  PMID: 35654789

INTRODUCTION

No medications have U.S. Food and Drug Administration (FDA) approval to treat behavioral disturbances in dementia, yet off-label use of psychotropic medications such as antipsychotics (APs) remains common.1 Policymakers focused on reducing AP use in dementia following the 2005 and 2008 FDA black box warnings, which cautioned that APs are associated with increased mortality when prescribed for treatment of dementia-related behaviors.2 In 2012, the Centers for Medicare & Medicaid Services (CMS) established the National Partnership to Improve Dementia Care in Nursing Homes to improve the quality of nursing home dementia care. Progress was measured as the facility-level percentage of long-stay residents who received an AP (excluding residents diagnosed with schizophrenia, Huntington’s disease, or Tourette’s syndrome),3 which CMS began publicly reporting as a quality indicator in 2015 through the Nursing Home Care Compare website and the Five Star Quality Rating System.4 In 2018, the U.S. Department of Veterans Affairs (VA) developed a similar publicly reported rating system adapted from CMS, called Community Living Center (CLC) Compare, which tracks the percentage of Veterans prescribed APs.5 However, neither the CMS nor the VA AP prescribing quality indicator focuses specifically on patients with dementia.

Multiple expert bodies, such as the American Psychiatric Association and the American Geriatrics Society, recommend using behavioral and environmental interventions as preferred treatment for behavioral disturbances in dementia.6–8 Introduction of the CMS AP quality indicator does not appear to have accelerated the decline of AP use; instead, it may have contributed to substitution of APs with other psychotropic medications.9 Thus, policies focused on reducing AP use may unintentionally have led providers to shift patients to alternative psychotropic classes with less evidence of benefit and similar risks.10

Currently, the rate of AP prescribing within a nursing home typically serves as the sole means for evaluating the quality of dementia management in long-term care facilities.11 However, low AP use does not necessarily indicate quality dementia care; a facility could have low AP use but high use of alternative measures (e.g., physical restraints, other psychotropic medications) and low use of non-pharmacological strategies. Thus, low AP use alone does not adequately represent dementia care quality at a facility.

In this context, we convened a modified Delphi panel to identify and reach consensus on additional potential quality indicators (QIs) that facilities can readily assess from claims data for long-stay VA nursing home residents with dementia and that could more accurately capture the quality of dementia care.

METHODS

Quality Indicators

The investigative team reviewed literature to identify possible additional QIs for nursing home residents with dementia, focusing on medication classes potentially used to treat behavioral disturbances in dementia and indicators that could reflect meaningful clinical decompensation; we focused on indicators available through administrative claims and therefore feasible to implement in healthcare settings. Selected potential QIs represent measures generated from VA claims data, such as other medication use and health care utilization, and measures generated from Minimum Data Set (MDS) data, such as physical restraint use and change in behavioral symptom score computed using the Aggressive Behavior Scale (ABS).12–15 Unlike the current CMS and VA AP quality indicators, we limited the measures to patients with dementia to increase their specificity.

Panel Selection

We selected the Delphi panel participants to represent a range of expertise related to dementia care for Veterans. Ten experts agreed to serve on the panel, representing seven VA medical centers and two VA national offices (Table 1). Their areas of expertise included geriatric psychiatry, geropsychology, geriatric medicine, nursing, pharmacy, and research. We assessed positions within VA through self-report on our first survey, and potential responses included broad categories to protect the privacy of the participants.

Table 1:

Panelist demographic characteristics per self-report

N¹ (%)
Current position within VA CLC
 Researcher (e.g., of geriatric care, policies, interventions) 5 (50)
 Psychiatrist 2 (20)
 Director (e.g., of CLC, Geriatrics division) 1 (10)
 Other (e.g., retired, trainer of VA CLC staff) 2 (20)
Years of experience in VA CLC
 1 to <5 years 3 (30)
 10 years or more 7 (70)
Years of experience in a non-VA nursing home
 None 1 (10)
 <1 year 1 (10)
 1 to <5 years 3 (30)
 5 to <10 years 1 (10)
 10 years or more 4 (40)
Years of experience in dementia care (both VA and non-VA)
 5 to <10 years 1 (10)
 10 years or more 9 (90)

¹ N range = 0 to 10

The VA Central IRB (CIRB) reviewed the panel aims and determined that panelists would act in an advisory function, not as research subjects.

Modified Delphi Panel

The Delphi method is a systematic technique that includes rounds of survey ranking, with anonymous feedback on responses provided between rounds to move subsequent responses toward convergence.16 We modified the standard Delphi process using the RAND/UCLA Appropriateness Method (RAM),17 applied virtually: we collected ratings asynchronously via electronic survey and held the panel meeting, traditionally conducted in person, virtually. Our e-Delphi had an initial round of rating, a subsequent virtual face-to-face discussion with all panelists, and a final round of rating. We chose to use two rating rounds a priori to minimize the burden on participants and for scheduling purposes; a common RAM protocol for observing consensus features two rounds of rating.17, 18 Evaluation of online modified Delphi panels has shown them to be an acceptable adaptation of the consensus-building deliberations of the Delphi method, especially when combined with in-person discussion.19

Prior to the first round of ratings, panelists received a Panelist Handbook that contained our proposed QIs alongside our FY2018 VA CLC statistics on patients with dementia, presented as averages within AP-prescribing quartiles, and cited literature regarding existing recommendations, evidence for harm, and non-VA statistics for each of the QIs. The Panelist Handbook also contained instructions on how to rate the QIs on importance, usefulness, and feasibility (Table 2).

Table 2:

Rating Instructions Provided to Panelists

Rating Criteria | When should a measure receive a high rating?
1: Importance | Rate a measure high on importance if:
 1. The aspect of care covered by the measure is important to high-quality care for patients with dementia.
 2. The measure has significance and relevance to stakeholders.
 3. The measure represents an opportunity for improvement.
2: Usefulness | Rate a measure high on usefulness if:
 1. The measure could have a substantial impact on clinical care.
 2. You would likely use or encourage incorporating this measure for quality improvement in your practice.
 3. The numerator includes a clinically significant population.
 *Please consider opportunities in VA (data presented) and outside VA (data not presented except when available in the literature) when you are rating.
3: Feasibility | Rate a measure high on feasibility if:
 1. The data necessary to calculate the measure is readily available.
 2. The data necessary to calculate the measure is likely reliable.
 3. The data necessary to calculate the measure is likely unbiased.

Rating Scales and Definition of Consensus

Respondents used a 9-point Likert scale to rate the importance, usefulness, and feasibility of 12 QIs, for a total of 36 items on each survey. Consensus for an item was achieved when ≥70% of responses fell within a three-point range surrounding the median (i.e., median score ± 1 point). In a systematic review of 250 Delphi and other consensus group methods, the definition of consensus varied greatly, from 20% to 100%.20 The RAM manual’s “classic” definition of agreement is that ≤2 panelists rate outside the 3-point range containing the median. Since we had a 10-member panel, we opted to permit ≤3 panelists to rate outside of the median range, which is equivalent to the ≥70% threshold. We selected thresholds for high, equivocal, and low ratings guided by the RAM manual, which recommends classification into three levels: 1–3, 4–6, and 7–9.17 Relevance for an item was achieved when the median was 7–9 and consensus was reached.
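The consensus and relevance rules described above can be expressed as a short calculation. The following is a minimal sketch rather than the study's actual analysis code: the function name `classify_item`, the example ratings, and the treatment of half-point medians (which can arise with an even number of panelists) are assumptions made for illustration.

```python
from statistics import median


def classify_item(ratings, consensus_pct=70):
    """Classify one survey item from its panel ratings on the 1-9 scale."""
    med = median(ratings)
    # Count ratings inside the three-point window around the median (median +/- 1).
    within = sum(1 for r in ratings if abs(r - med) <= 1)
    # Integer comparison avoids floating-point rounding at exactly 70%.
    consensus = within * 100 >= consensus_pct * len(ratings)
    # Tier boundaries follow the 1-3 / 4-6 / 7-9 split; how half-point
    # medians (e.g. 6.5) are assigned is an assumption of this sketch.
    if med >= 7:
        tier = "high"
    elif med >= 4:
        tier = "equivocal"
    else:
        tier = "low"
    return {
        "median": med,
        "outside": len(ratings) - within,
        "consensus": consensus,
        "relevant": consensus and tier == "high",
        "tier": tier,
    }


# Hypothetical ratings from a 10-member panel (not study data).
example = classify_item([9, 9, 9, 8, 8, 9, 7, 9, 9, 5])
print(example)
```

With ten panelists, the 70% threshold and the "at most three ratings outside the window" rule give the same answer, which is why the sketch reports the `outside` count alongside the consensus flag.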

Panelists received individualized feedback following the first survey in a results summary. This summary included the specified panelist’s response, the median, the distribution of responses from all panelists, and whether the panel achieved consensus.

Data Collection

Our team pilot tested the Panelist Handbook and the online survey with a colleague who did not serve as a panelist to assess readability and time commitment. We emailed the first online survey link to panelists in January 2021 and the survey remained open for 2 weeks. We emailed the second online survey link in February 2021 and the survey remained open for 3 weeks. The virtual meeting occurred between survey rounds. Panelists could respond to the second survey immediately following the virtual meeting. Within the online survey, we included our VA FY2018 statistics for each indicator and referred panelists to the Panelist Handbook. For the second survey, we advised panelists to refer to their results summary from the first survey, and the notes provided after the virtual meeting.

Panelists rated all 12 QIs on importance, usefulness, and feasibility in both survey rounds. In addition, the first survey contained demographic questions to capture the panel composition, and the second survey contained questions assessing panelists’ satisfaction with the various components of the Delphi process. Last, we asked panelists during the group discussion if they thought we should include any additional QIs.

RESULTS

All 10 panelists completed both surveys and responded to all items. Figure 1 shows the second survey’s distribution of ratings.

Figure 1:

Distribution of ratings on survey 2

Table 3 shows the median scores from the second round for each item on which the panel achieved consensus. Twenty-four of the 36 items achieved consensus, with 15 of those having a median ≥7 and thus designated relevant. Of these, seven QIs reached relevance for importance, three for usefulness, and five for feasibility. Two QIs reached relevance on all three domains: 1) percent of CLC patients with dementia prescribed APs and 2) percent of CLC patients with dementia prescribed benzodiazepines. An additional QI, percent of CLC patients with dementia with physical restraint use, reached consensus on all three domains but reached relevance only on importance and usefulness. By contrast, in our round one survey 21 items achieved consensus, with 15 of those designated relevant, and only the benzodiazepine QI reached relevance on all three domains (Table 4).
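As a cross-check on the totals reported above, the medians listed in Table 3 can be tallied directly. This is an illustrative sketch: the dictionary name `survey2_medians` is an assumption, and the values are copied in the order they appear in each table row without assigning them to specific domains.

```python
# Medians for items that reached consensus on survey 2, in the order
# listed in each Table 3 row. Column (domain) positions are deliberately
# not modelled here; only the values themselves are tallied.
survey2_medians = {
    "Physical restraint use": [9, 8, 6],
    "Antipsychotic use": [9, 7, 7],
    "Positive behavioral symptom score": [9],
    "Benzodiazepine use": [8, 8, 7],
    "Anxiolytic/sedative/hypnotic use": [7, 7],
    "Opioid use": [7, 4],
    "Inpatient hospitalization": [7, 6, 5],
    "Antidepressant use": [6, 7],
    "Antiepileptic use": [6],
    "Memory medication use": [7],
    "Non-antipsychotic psychotropic use": [6],
    "Emergency Department visit": [6, 6],
}

# Every listed median corresponds to one item that reached consensus.
consensus_items = sum(len(v) for v in survey2_medians.values())
# An item is relevant when it reached consensus with a median of 7 or higher.
relevant_items = sum(1 for v in survey2_medians.values() for m in v if m >= 7)

print(consensus_items, relevant_items)
```

The two totals can be compared against the 24 consensus items and 15 relevant items stated in the text.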

Table 3:

Median scores on Survey 2 items reaching consensus

Measures: percent of CLC patients with dementia with any:
Importance¹ Usefulness² Feasibility³
Physical restraint use 9 8 6
Antipsychotic use 9 7 7
Positive behavioral symptom score 9
Benzodiazepine use 8 8 7
Anxiolytic/sedative/hypnotic use 7 7
Opioid use 7 4
Inpatient hospitalization 7 6 5
Antidepressant use 6 7
Antiepileptic use 6
Memory medication use 7
Non-antipsychotic psychotropic use 6
Emergency Department visit 6 6

¹ Scale: Likert scale where 1 = Not at all important and 9 = Extremely important
² Scale: Likert scale where 1 = Not at all useful and 9 = Extremely useful
³ Scale: Likert scale where 1 = Not at all feasible and 9 = Extremely feasible
* Greyed out cells indicate where consensus was not achieved

Table 4:

Median scores on Survey 1 items reaching consensus

Measures: percent of CLC patients with dementia with any:
Importance¹ Usefulness² Feasibility³
Physical restraint use 9 8 6
Antipsychotic use 9 7
Benzodiazepine use 8 8 7
Positive behavioral symptom score 8 5
Anxiolytic/sedative/hypnotic use 7 6 7
Emergency Department visit 7 6 6
Inpatient hospitalization 7
Non-antipsychotic psychotropic use 4 7
Opioid use 7
Antidepressant use
Antiepileptic use
Memory medication use 7

¹ Scale: Likert scale where 1 = Not at all important and 9 = Extremely important
² Scale: Likert scale where 1 = Not at all useful and 9 = Extremely useful
³ Scale: Likert scale where 1 = Not at all feasible and 9 = Extremely feasible
* Greyed out cells indicate where consensus was not achieved

Three QIs reached consensus at the highest median score (9) for importance: 1) percent of CLC patients with dementia with any physical restraint use, 2) AP use, and 3) positive behavioral symptom score. Four QIs reached consensus at a median score of 7–8 for importance: 1) percent of CLC patients with dementia with any benzodiazepine use, 2) anxiolytic/sedative/hypnotic use, 3) opioid use, and 4) inpatient hospitalization. Three QIs did not achieve relevance (median of 7–9 with consensus) on any of the three domains: 1) percent of CLC patients with dementia with any antiepileptic use, 2) non-AP psychotropic use, and 3) Emergency Department visit. One item, the usefulness of percent of CLC patients with dementia with any opioid use, achieved consensus at a median <5. During our virtual meeting, multiple panelists said they saw this QI as having low usefulness because of the difficulty of differentiating between acute and chronic opioid use. Panelists also commented on the limited usefulness of indicators of other medication use, including antidepressant, antiepileptic, and memory medication use, given the difficulty of excluding patients with appropriate use for other clinical conditions (e.g., depression, seizure disorder) from the measure denominator. Panelists did not suggest any QIs beyond the ones proposed.

DISCUSSION

The study team convened a modified Delphi panel to identify additional QIs for nursing home residents with dementia that could better capture the quality of dementia care. However, besides the existing AP QI, only one additional QI reached relevance (a median of 7–9 with consensus) on all three domains: percent of CLC residents with dementia with benzodiazepine use. As benzodiazepines are frequently used for treatment of behavioral disturbances in dementia and carry a similar level of potential harm as APs,19, 21–23 our results suggest that assessing benzodiazepine prescribing alongside AP prescribing when evaluating dementia care quality could prove important, useful, and feasible.

Our Delphi panel further highlighted complexities of assessing quality dementia care through administrative claims-based indicators, particularly with few available options for capturing non-pharmacological care. Overall, results did not identify a single ideal indicator; none achieved the highest median score on all three elements with consensus among panel members. Physical restraint use ranked the highest overall in combined importance and usefulness but had equivocal feasibility. Positive behavioral symptom score (e.g., marker of agitation, aggression, or behavioral disturbances) also received the highest ranking in importance but did not achieve consensus on usefulness or feasibility. Advancing the ability to measure dementia care quality would benefit from new means to capture delivery of behavioral or environmental interventions, such as MDS items to reflect use of behavioral interventions or additional meaningful patient outcomes such as quality of life.

Besides the ability to capture different types of data, a theme from panelists during the virtual meeting was the need for additional population exclusions from denominators for medication-based QIs. Panelists noted that the “right” amount of medication use for nursing home residents with dementia is not zero for any of the potential QIs; therefore, as with the AP measure, implementing any of the QI alternatives without relevant denominator exclusions could reduce potentially appropriate medication use. For example, patients with dementia might have bipolar disorder that warrants AP use, pain that warrants opioid use, or seizure disorder that warrants antiepileptic use. Excluding people who have appropriate indications for use of a specific medication class would more accurately capture potentially inappropriate use of these medications, though this then creates the potential that these exclusionary diagnoses are applied in order to circumvent relevant metrics.24 Given challenges with obtaining accurate diagnostic information, the panel recognized the need for QI measures that balance accuracy with ease of operationalization.
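The effect of denominator exclusions on a medication-based QI can be illustrated with a small calculation. This is a hedged sketch with made-up records, not the specification of any CMS or VA measure: the `Resident` fields, the `qi_percent` function, and the diagnosis labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Resident:
    """Hypothetical long-stay resident record; field names are illustrative."""
    has_dementia: bool
    on_medication: bool  # e.g., any antipsychotic use in the reporting period
    diagnoses: set = field(default_factory=set)


def qi_percent(residents, exclusions=frozenset()):
    """Percent of residents with dementia on the medication, after exclusions."""
    denominator = [
        r for r in residents
        if r.has_dementia and not (r.diagnoses & exclusions)
    ]
    if not denominator:
        return None  # the measure is undefined for an empty denominator
    numerator = sum(1 for r in denominator if r.on_medication)
    return 100 * numerator / len(denominator)


# Made-up facility: four residents with dementia, two of them on the medication.
facility = [
    Resident(True, True, {"bipolar disorder"}),
    Resident(True, True),
    Resident(True, False),
    Resident(True, False),
    Resident(False, True, {"schizophrenia"}),  # no dementia, so never counted
]

print(qi_percent(facility))                        # no exclusions applied
print(qi_percent(facility, {"bipolar disorder"}))  # one exclusion applied
```

In this toy facility, excluding the resident with an appropriate indication lowers the reported rate, which mirrors the panelists' point that the measured value depends heavily on who is allowed into the denominator.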

A potential limitation of this study is our selection of QIs specific to VA CLC residents with dementia. This may limit generalizability outside of VA, given that the VA population is primarily male, VA CLCs use a national formulary, and the VA provides an array of psychosocial services to Veterans.25 However, many similarities exist between the VA and community care with regard to behavioral challenges for those with dementia, staffing issues, and general patterns of prescription drug use. Additionally, others may construct our proposed QIs in the Medicare population using claims and MDS data.

This study had multiple strengths. Our experienced team of investigators developed and presented QIs to panelists with data generated from rigorous analysis of VA and MDS data in addition to published data and resources. Others may readily construct our proposed QIs in the Medicare population using claims and MDS data.

Our panelists included experts in dementia care, representing a range of expertise, with a combined minimum of 95 years of experience in dementia care. All our panelists had experience working within VA CLCs (combined minimum of 73 years of experience) and all panelists except one had experience with employment in a non-VA nursing home (combined minimum of 48 years of experience). We opted not to include a caregiver in our panel given its focus on the logistics and technical specifications of QI numerators and denominators. As with any Delphi panel, another group of experts may have reached different conclusions.
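The combined minimum years of experience quoted above appear to follow from Table 1 by multiplying each category's lower bound by the number of panelists in that category. The sketch below reproduces that arithmetic; treating "<1 year" as a lower bound of zero is an assumption of this illustration, as are the variable and function names.

```python
# Each entry is (lower bound of the experience category in years,
# number of panelists in that category), taken from Table 1.
dementia_care = [(5, 1), (10, 9)]
va_clc = [(1, 3), (10, 7)]
# "None" and "<1 year" are both given a lower bound of 0 (an assumption).
non_va_nursing_home = [(0, 1), (0, 1), (1, 3), (5, 1), (10, 4)]


def combined_minimum(categories):
    """Sum of category lower bounds weighted by panelist counts."""
    return sum(lower * n for lower, n in categories)


for label, cats in [
    ("Dementia care", dementia_care),
    ("VA CLC", va_clc),
    ("Non-VA nursing home", non_va_nursing_home),
]:
    print(label, combined_minimum(cats))
```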

The modifications employed in our Delphi process were effective for enhancing participation, ease of analysis, and follow-up with delayed responders. Our a priori decision to hold two rounds of rating proved appropriate, given the limited movement toward consensus between rating rounds.

Our modified Delphi process included a time lag for some panelists between the virtual meeting and their responses to the second survey, representing a potential limitation in the approach. In our Delphi, the second online survey became available to panelists immediately following the virtual meeting, but panelists had up to three weeks to complete the final survey. The mean number of days to respond to the survey was 6.9 days post-virtual meeting, with 50% of panelists responding within one day. The study team provided notes from the virtual meeting to the panelists immediately following the meeting and advised panelists to have both these and their individual rating sheets from the first survey on-hand when responding to the second survey. However, for panelists who let some time pass between the meeting and responding to the survey, the group discussion may have had less of an impact on their re-ratings than had they responded to the survey earlier.

CONCLUSIONS AND IMPLICATIONS

Of the proposed QIs, our panel of dementia care experts rated only two as relevant across all three domains: the percent of long-stay residents with dementia prescribed APs and the percent prescribed benzodiazepines. Given the ample evidence for harms with use of benzodiazepines in dementia, policymakers should consider the benzodiazepine QI in addition to AP use. Challenges remain in identifying additional QIs that meet the threshold in all three areas of importance, usefulness, and feasibility and accurately reflect quality nursing home dementia care. Future initiatives aimed at improving the quality of care for nursing home residents with dementia should focus on how to measure the use of recommended evidence-based non-pharmacologic alternatives. Pharmacologic QIs should include a nuanced consideration of who gets included in the denominator and exclude those who have conditions with appropriate use indications.

Acknowledgments

The authors would like to thank the 10 experts who served on the panel for their time and valuable input, including: Joan Carpenter, Kim Curyto, Reiko Emtman, Michele Karel, Mark Kunik, Eleanor McConnell, Ritamarie Moscola, James Rudolph, Todd Semla, and Joel Streim. The authors also thank Ilse Wiechers for participating in pilot activities.

Disclosures:

We conducted this work virtually through the VA Ann Arbor Healthcare System’s Center for Clinical Management Research. This project, IIR 15–330 (Zivin), received funding from VA Health Services Research & Development. Dr. Gerlach was supported in part by grant K23AG066864 from the National Institute on Aging. The data have not previously been presented orally or by poster at scientific meetings.

Footnotes

DISCLOSURE/CONFLICT OF INTEREST

The authors report no conflicts with any product mentioned or concept discussed in this article.

REFERENCES

1. Lucas JA, Bowblis JR. CMS Strategies To Reduce Antipsychotic Drug Use In Nursing Home Patients With Dementia Show Some Progress. Health Affairs. Jul 1 2017;36(7):1299–1308.
2. US Food and Drug Administration. Public Health Advisory: Deaths with Antipsychotics in Elderly Patients with Behavioral Disturbances; 2005.
3. Centers for Medicare and Medicaid Services. Data show National Partnership to Improve Dementia Care achieves goals to reduce unnecessary antipsychotic medications in nursing homes; 2017.
4. Centers for Medicare and Medicaid Services. Five-Star Quality Rating System. Available at: https://www.cms.gov/Medicare/Provider-Enrollment-and-Certification/CertificationandComplianc/FSQRS.
5. Department of Veterans Affairs. Quality of care: Community Living Center Health Surveys. Available at: https://www.va.gov/QUALITYOFCARE/apps/aspire/clcsurvey.aspx.
6. American Geriatrics Society. Ten Things Physicians and Patients Should Question. Available at: http://www.choosingwisely.org/societies/american-geriatrics-society/.
7. Kales HC, Gitlin LN, Lyketsos CG; Detroit Expert Panel on Assessment and Management of Neuropsychiatric Symptoms of Dementia. Management of neuropsychiatric symptoms of dementia in clinical settings: recommendations from a multidisciplinary expert panel. Journal of the American Geriatrics Society. Apr 2014;62(4):762–769.
8. American Psychiatric Association. Five Things Physicians and Patients Should Question. Available at: https://www.choosingwisely.org/wp-content/uploads/2015/02/AGS-Choosing-Wisely-List.pdf.
9. Kales HC, Zivin K, Kim HM, et al. Trends in Antipsychotic Use in Dementia 1999–2007. Archives of General Psychiatry. 2011;68(2):190–197.
10. Maust DT, Kim HM, Chiang C, Kales HC. Association of the Centers for Medicare & Medicaid Services’ National Partnership to Improve Dementia Care With the Use of Antipsychotics and Other Psychotropics in Long-term Care in the United States From 2009 to 2014. JAMA Internal Medicine. May 1 2018;178(5):640–647.
11. Centers for Medicare and Medicaid Services. Fiscal Year (FY) 2020 Mission & Priority document (MPD) 2019.
12. McCreedy E, Ogarek JA, Thomas KS, Mor V. The Minimum Data Set Agitated and Reactive Behavior Scale: Measuring Behaviors in Nursing Home Residents With Dementia. Journal of the American Medical Directors Association. Dec 2019;20(12):1548–1552.
13. Perlman CM, Hirdes JP. The aggressive behavior scale: a new scale to measure aggression based on the minimum data set. Journal of the American Geriatrics Society. Dec 2008;56(12):2298–2303.
14. Ahn H, Horgas A. The relationship between pain and disruptive behaviors in nursing home residents with dementia. BMC Geriatrics. Feb 11 2013;13:14.
15. Cen X, Li Y, Hasselberg M, Caprio T, Conwell Y, Temkin-Greener H. Aggressive Behaviors Among Nursing Home Residents: Association With Dementia and Behavioral Health Disorders. Journal of the American Medical Directors Association. Dec 2018;19(12):1104–1109 e1104.
16. Turoff M, Linstone HA, eds. The Delphi method-techniques and applications. Massachusetts: Addison-Wesley Publishing Company; 2002.
17. Fitch K, Bernstein SJ, Aguilar MD, et al. RAND/UCLA Appropriateness Method User’s Manual. Arlington, VA: RAND; 2001.
18. Jandhyala R. Delphi, non-RAND modified Delphi, RAND/UCLA appropriateness method and a novel group awareness and consensus methodology for consensus measurement: a systematic literature review. Current Medical Research and Opinion. 2020;36(11):1873–1887.
19. Nørgaard A, Jensen-Dahm C, Gasse C, Wimberley T, Hansen ES, Waldemar G. Association of Benzodiazepines and Antidepressants With 180-Day Mortality Among Patients With Dementia Receiving Antipsychotic Pharmacotherapy: A Nationwide Registry-Based Study. Journal of Clinical Psychiatry. 2020;81(4).
20. Humphrey-Murto S, Varpio L, Wood TJ, et al. The Use of the Delphi and Other Consensus Group Methods in Medical Education Research: A Review. Academic Medicine. 2017;92(10):1491–1498.
21. Fick D, Semla T, Beizer J, et al. American Geriatrics Society Updated Beers Criteria for Potentially Inappropriate Medication Use in Older Adults. Journal of the American Geriatrics Society. Apr 2012;60(4):616–631.
22. Taipale H, Tolppanen AM, Koponen M, et al. Risk of pneumonia associated with incident benzodiazepine use among community-dwelling adults with Alzheimer disease. Canadian Medical Association Journal. Apr 10 2017;189(14):E519–E529.
23. American Psychiatric Association. Practice Guideline for the Treatment of Patients With Alzheimer’s Disease and Other Dementias 2010.
24. Fashaw-Walters SA, McCreedy E, Bynum JPW, Thomas KS, Shireman TI. Disproportionate increases in schizophrenia diagnoses among Black nursing home residents with ADRD. Journal of the American Geriatrics Society. Dec 2021;69(12):3623–3630.
25. Thomas KS, Cote D, Makineni R, et al. Change in VA Community Living Centers 2004–2011: Shifting Long-Term Care to the Community. Journal of Aging and Social Policy. Mar-Apr 2018;30(2):93–108.
