Item Response Theory Analysis: PROMIS® Anxiety Form & Generalized Anxiety Disorder Scale

Wen Liu; Lilian Dindo; Katherine Hadlandsmyth; George Jay Unick; M Bridget Zimmerman; Barbara St. Marie; Jennie Embree; Toni Tripp-Reimer; Barbara Rakel

doi:10.1177/01939459211015985

Author manuscript; available in PMC: 2023 Aug 1.

Published in final edited form as: West J Nurs Res. 2021 May 17;44(8):765–772. doi: 10.1177/01939459211015985

PMCID: PMC8595462 NIHMSID: NIHMS1702072 PMID: 33998340

Abstract

Little research has compared item functioning of the PROMIS^® Anxiety Short Form 6a and the Generalized Anxiety Disorder 7-Item Scale using Item Response Theory models. This was a secondary analysis of self-reported assessments from 67 at-risk U.S. military veterans. The two measures performed comparably well, with data fitting the models adequately, acceptable item discriminations, and item and test information curves that were unimodal and symmetric. The PROMIS^® Anxiety Short Form 6a performed better in that item difficulty estimates had a wider range and were distributed more evenly and all response categories had less floor effect, while the third category in most items of the Generalized Anxiety Disorder 7-Item Scale was rarely used. While both measures may be appropriate, findings provided preliminary information supporting use of the PROMIS^® Anxiety Short Form 6a as potentially preferable, especially for veterans with low-to-moderate anxiety. Further testing is needed in larger, more diverse samples.

Keywords: Anxiety, Item Response Theory, Orthopedic, Patient-Reported Outcomes Measurement Information System^® (PROMIS^®) Short Form, Generalized Anxiety Disorder 7-Item Scale

Anxiety includes a range of psychological and physical features, including feeling nervous, tense, uneasy, worried, or fearful, and increased heart rate and breathing rates (Tuma & Maser, 2019). The use of a psychometrically sound measure is critical for identification and assessment of anxiety to identify appropriate interventions and evaluate treatment effects (Waltz et al., 2010). Among patients presenting for orthopedic surgeries, elevated anxiety poses a risk for poor long-term pain and functional outcomes (Lewis et al., 2015; Noiseux et al., 2014), highlighting the critical need for valid measures of anxiety in this population.

PROMIS^® Anxiety Short Forms vs. Generalized Anxiety Disorder 7-Item Scale (GAD-7)

Patient-Reported Outcomes Measurement Information System^® (PROMIS^®) instruments are calibrated based on Item Response Theory (IRT), allowing for the development of short forms from large item banks without sacrificing precision (National Institute of Health, 2019). IRT is a modern measurement approach that models the probability of selecting a certain item response option as the function of the person’s level of latent trait or attribute and item characteristics, and examines the performance of individual items of the (sub)scale (i.e., item difficulty and discrimination) as well as the appropriateness of response options (de Ayala, 2009). For assessing anxiety in adults using PROMIS^® instruments, there are two administration options: 1) computer adaptive test (CAT) using a 29-item bank (an item pool for selection of appropriate items based on individuals’ responses), and 2) four PROMIS^® Anxiety Short Forms (4a, 6a, 7a, 8a), in which items were selected based on content expert review and item rankings (National Institute of Health, 2019). Three short forms are nested: 6a includes all items in the 4a plus two additional items, and 8a includes all items in the 6a plus two additional items (National Institute of Health, 2019). The short form 7a is an exception because 1 out of the 7 items is not available in the other three short forms (National Institute of Health, 2019). Using IRT, PROMIS^® Anxiety Short Forms (4a, 6a, 7a, and 8a) showed high reliability estimates and good-to-adequate model fit, were relatively invariant across participant groups studied, and performed well among ethnically diverse subgroups (Teresi et al., 2016).

Classical Test Theory (CTT) is a commonly used measurement framework that assumes the true score cannot be directly observed and can only be assessed indirectly through the observed test score with error (Allen & Yen, 2001). PROMIS^® Anxiety measures have been tested using CTT and demonstrated validity and reliability in adult populations with different conditions, including patients with osteoarthritis (Driban et al., 2015; Lee et al., 2017), multiple sclerosis (Marrie et al., 2018), lumbar degenerative disease (Purvis et al., 2018), and orthopedic conditions (Beleckas et al., 2018; Hadlandsmyth et al., 2020). While reliability and validity of different anxiety short forms are highly similar, the longest short form (8a) is recommended for use to achieve the most precise measure and the shortest form (4a) is recommended for use when there is little room for additional measures to capture anxiety as a secondary outcome (National Institute of Health, 2019). Such recommendations are primarily based on the length of forms and participant burden, as opposed to psychometric evidence in a targeted population.

The GAD-7 was initially developed to identify generalized anxiety disorder (GAD) in primary care settings (Spitzer et al., 2006). The GAD-7 has been used extensively as a gold standard for screening, diagnosing, and assessing the severity of GAD in varied care settings. The GAD-7 showed evidence for unidimensionality and decent fit to IRT models in primary care patients (Jordan et al., 2017). In addition, the GAD-7 has been tested using CTT, with established reliability and validity in different adult patient populations, including patients following coronary artery bypass surgery (Tully & Baker, 2012), orthopedic surgery (Hadlandsmyth et al., 2020), and patients with chronic depression (Toussaint et al., 2020), anxiety, and mood disorders (Rutter & Brown, 2017).

While the use of the four PROMIS^® measures and the GAD-7 in research has been increasing, the primary purpose is to calculate correlations to demonstrate validity of these measures. For example, validity of PROMIS^® Anxiety measures was established through moderate-to-strong correlation with the GAD-7 in patients presenting for spine surgery (r=0.72; p < 0.001) (Purvis et al., 2019) and an online sample of Australian adults (ICC=0.91, 95% CI=0.91, 0.92) (Sunderland et al., 2018). Prior research that examined the PROMIS^® Anxiety 20-item bank (a set of 20 items from the original 29-item bank) and the GAD-7 using IRT linking analysis found that six items in the GAD-7 (except for item 6 “irritability”) were consistent with the content coverage of the PROMIS^® items, and the combined 27-item set of the PROMIS^® and GAD-7 showed evidence of unidimensionality and adequate fit to IRT models (Schalet et al., 2014).

While both PROMIS Anxiety Short Forms and the GAD-7 have been tested using CTT- and IRT-based approaches, little research has compared the psychometric performance of the PROMIS^® Anxiety Short Forms and GAD-7 using IRT-based modeling in the same adult patient sample. Psychometric performance of any measurement is determined based on distributions of scores in a given sample, and therefore may vary by population (Waltz et al., 2010). Evidence is needed to determine which measure performs best in assessing anxiety in targeted populations.

Purpose

The purpose of this study was to compare the psychometric performance of the PROMIS^® Anxiety Short Form 6a and the GAD-7 in veterans following orthopedic surgery using IRT-based models. Information obtained serves as preliminary evidence to guide the use of the more appropriate and psychometrically sound measure to assess anxiety in this population.

Methods

Study Design

This was a secondary analysis of data collected from a pilot randomized controlled trial that evaluated the efficacy of Acceptance and Commitment Therapy on persistent postsurgical pain in at-risk orthopedic patients (Dindo et al., 2018). Ethical approval of the parent study was obtained from the local institutional review board.

Sample and Setting

In the parent study, participants were eligible if they: 1) were U.S. military veterans post orthopedic surgeries requiring ≥6 weeks of rehabilitation; and 2) had elevated preoperative risk of persistent postsurgical pain (i.e., severe preoperative pain with a score of ≥7 on a 0-10 scale, or greater than mild pain with a score of ≥3 on a 0-10 scale along with elevated anxiety or depression as assessed by Hamilton Anxiety and Depression Scales [Hamilton, 1959, 1986]). Participants were recruited from a single orthopedic clinic in the parent study.

Measures

PROMIS^® Anxiety Short Forms

The PROMIS^® Anxiety Short Forms assess fear, anxious misery, hyperarousal, and somatic symptoms related to arousal. Each item is scored on a 1–5 Likert rating format (never, rarely, sometimes, often, and always) for the past seven days. Data for the PROMIS^® Anxiety Short Form 6a (the first six items of 8a) were extracted from the PROMIS^® Anxiety Short Form 8a that was originally collected in the parent study to estimate internal consistency. In this study, estimates of internal consistency were more than adequate in the 8a (Cronbach alpha=0.93) (Hadlandsmyth et al., 2020) with potential consideration of item redundancy, very good in the 6a (Cronbach alpha=0.88), and respectable in the 4a (Cronbach alpha=0.79). We selected the Short Form 6a for comparison with the GAD-7 because 1) it had the most desirable estimate of internal consistency among the three short forms and 2) items from the other two scales (8a and 4a) are nested in the Short Form 6a.

GAD-7

In the GAD-7, each item is scored on a 0-3 rating format (not at all, several days, more than half the days, and nearly every day) for the past two weeks. Total scale score ranges from 0 to 21. In this study, the GAD-7 showed very good estimates of internal consistency (Cronbach alpha=0.90) and good concurrent validity based on strong correlation with the PROMIS^® short form 8a (r=0.69, 95% CI=0.53, 0.79) (Hadlandsmyth et al., 2020).

Data Analysis

Distributions of item responses and total scale scores were examined with descriptive statistics in SPSS 26.0 (SPSS, Chicago, IL). Graded Response Models, which are two-parameter IRT models for ordered polytomous responses, were estimated using the mean- and variance-adjusted weighted least squares estimator (WLSMV), an estimation method appropriate for categorical data, in Mplus Version 7.1 (Muthén & Muthén, 2010). The two-parameter model has three assumptions: 1) unidimensionality (items in a (sub)scale measure only one single underlying construct), 2) local independence (item responses are uncorrelated after taking the level on the latent construct into account), and 3) minimal guessing in selecting response options (Rasch, 1960). The two-parameter model is a more appropriate model for the data than the one-parameter model, because unlike the one-parameter model that assumes constant item discrimination across response options, the two-parameter model allows varying discriminations across response options for each item. In this study, multiple IRT-based model parameters [model fit, item difficulties (locations), item discrimination estimates], as well as IRT-based graphs [item characteristic curves (ICCs), item information curves, test information curve] (Table 1) were compared between the PROMIS^® Anxiety Short Form 6a and GAD-7.
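To make the graded response model concrete, the sketch below computes the probability of each ordered response category for one item at a given level of the latent trait. It is a minimal illustration, not the study's estimation procedure: it assumes a logistic link (the software used in the study may parameterize the model differently), and the function name `grm_category_probs` and all parameter values are hypothetical.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a logistic graded response model.

    theta: person's level on the latent trait
    a: item discrimination (hypothetical value in the example below)
    thresholds: ordered item locations, one per step between adjacent categories
    """
    # Cumulative probability of responding in category k or higher.
    # By convention it is 1 for the lowest category and 0 beyond the highest.
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # The probability of exactly category k is the difference of adjacent cumulatives.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical item with four response categories (three steps).
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-0.5, 0.7, 2.0])
for category, p in enumerate(probs):
    print(f"category {category}: {p:.3f}")
```

Plotting these probabilities over a grid of `theta` values yields the item characteristic curves described in Table 1.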

Table 1.

Definitions of IRT-based Model Parameters and Graphs

IRT-based model parameters	Definitions
Item difficulties (locations)	An IRT-based model estimate representing the level of the attributes being measured that is associated with a transition from “failing” or “not endorsing” to “passing” or “endorsing” a certain response category of one item. The difficulty of a specific response category of one item is the amount of strength the respondent must possess to pass or endorse that response category or transition from the low level to the next higher level of the scoring options of that item.
Model fit	A set of model fit indices estimating the fit of data to an IRT model. These indices include chi-square goodness-of-fit index, Root Mean Square Error of Approximation (RMSEA), Comparative Fit Index (CFI), Tucker-Lewis Index (TLI), and Weighted Root Mean Square Residual (WRMR).
Item discrimination	An IRT-based model estimate representing the degree to which an item unambiguously classifies a response as a pass (endorse) or fail (not endorse). For an item with good item discrimination, the probability that different persons with the same level of an attribute would be discriminated the same on the latent continuum using the item is high.
IRT-based graphs
Item characteristic curves (ICCs)	A set of graphs representing the logistic function built upon model parameters demonstrating the relationship between a person’s level on the latent construct measured by the scale and the person’s probability of endorsing a certain response for each item.
Item information curves	A set of graphs representing the amount of information the response options of each item potentially contribute to decrease the uncertainty of a person’s level on the latent construct independent of the other items in the scale.
Test information curve	A graph representing a sum of information provided by all items of each test and demonstrates the range over the latent construct continuum where each scale performs the best in discriminating individuals.

Multiple model fit indices, including the chi-square goodness-of-fit index, Root Mean Square Error of Approximation (RMSEA), Comparative Fit Index (CFI), Tucker-Lewis Index (TLI), and Weighted Root Mean Square Residual (WRMR), were used to determine how well the graded response model fit the data in this study. A non-significant chi-square test indicates good fit (Kline, 2011). An RMSEA value of .08 or less indicates reasonable errors of approximation, whereas an RMSEA value between .08 and .10 indicates mediocre fit (Byrne, 2012; Hu & Bentler, 1999). CFI and TLI values above .95 suggest an acceptable fit (Hu & Bentler, 1999; Kline, 2011). A WRMR of less than 1.0 represents good model fit for categorical data (Byrne, 2012).
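The cutoffs above can be applied mechanically, as in the following sketch. The function `describe_fit` and its labels are hypothetical, the 0.05 significance level for the chi-square test is an assumption (the text does not state one), and the label used for RMSEA values above .10 is a placeholder because the text gives no interpretation for that range. The inputs are the fit values reported for the two measures in this study.

```python
def describe_fit(chi2_p, rmsea, cfi, tli, wrmr, alpha=0.05):
    """Label each fit index using the cutoffs stated in the text.

    alpha is an assumed significance level for the chi-square test.
    """
    if rmsea <= 0.08:
        rmsea_label = "reasonable"
    elif rmsea <= 0.10:
        rmsea_label = "mediocre"
    else:
        rmsea_label = "above stated range"  # not interpreted in the text
    return {
        "chi_square": "non-significant" if chi2_p > alpha else "significant",
        "rmsea": rmsea_label,
        "cfi": "acceptable" if cfi > 0.95 else "below cutoff",
        "tli": "acceptable" if tli > 0.95 else "below cutoff",
        "wrmr": "good" if wrmr < 1.0 else "above cutoff",
    }

# Fit values as reported for each measure in this study.
fit_6a = describe_fit(chi2_p=0.13, rmsea=0.089, cfi=0.997, tli=0.995, wrmr=0.481)
fit_gad7 = describe_fit(chi2_p=0.21, rmsea=0.065, cfi=0.996, tli=0.994, wrmr=0.460)
print("PROMIS 6a:", fit_6a)
print("GAD-7:", fit_gad7)
```

Under these stated cutoffs, the RMSEA of 0.089 for the 6a falls in the .08 to .10 band, while the remaining indices for both measures meet their criteria.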

Item difficulties represent the level of the attributes being measured that is associated with a transition from “failing” or “not endorsing” to “passing” or “endorsing” a certain response category of one item (de Ayala, 2009; DeVellis, 2016). For an item with more than two response categories, the difficulty estimate between the first two response categories is expected to be lower than that between the second and third response categories, and so on (de Ayala, 2009).

Item discrimination estimates represent the degree to which an item unambiguously classifies a response category as a pass (endorse) or fail (not endorse) and were estimated using the slope value of an ICC (de Ayala, 2009; DeVellis, 2016). The steeper the slope, the better the discrimination of the item; on the other hand, a flat or even negative slope represents problematic item discrimination (de Ayala, 2009; DeVellis, 2016).

ICCs are graphs of the logistic function built upon model parameters and demonstrate the relationship between a person’s level on the latent construct measured by the scale and the person’s probability of endorsing a certain response for each item. The item information curve represents the amount of information the response options of each item potentially contribute in decreasing the uncertainty of a person’s level on the latent construct independent of the other items in the scale. The test information curve represents the sum of information provided by all items of each scale and demonstrates the range over the latent construct continuum where each scale performs the best in discriminating individuals.

Results

Participant characteristics

A total of 67 participants (mean age=63.7, SD=8.9, range=34-84) completed the self-report anxiety measures at 3 months following orthopedic surgery and were included in this study. Most participants were ≥65 years old (61.2%), male (93%), Caucasian (85%), married (57%), not currently receiving opioids (69%), had undergone total knee arthroplasty (67%), and had an income of ≤US $40,000/year (56%).

Distribution of Response Options and Total Scores

Participant self-reported anxiety was slightly skewed toward low-to-moderate levels as measured by the PROMIS^® Anxiety Short Form 6a (Online Supplementary Table 1) and the GAD-7 (Online Supplementary Table 2). Most participants (74.6%-91.0%) reported “never” or “rarely” feeling anxiety, and none endorsed “always” in experiencing anxiety in the past 7 days as measured by the PROMIS^® Anxiety Short Form 6a. Similarly, most participants (92.5%-100%) did not experience anxiety or only experienced anxiety for several days in the past 2 weeks as measured by the GAD-7. The low-to-moderate level of anxiety can be reflected through the distribution of the total scores. Using the PROMIS^® Anxiety Short Form 6a, 26.9% of the participants scored 6 (lowest total score of 6a), 47.4% scored 7-12, 22.4% scored 13-18, and 3% scored 19-22. Similarly, using GAD-7, 31.3% of the participants scored 0 (the lowest total score of GAD-7), 46.3% scored 1-6, 19.4% scored 7-13, and 3% scored 14-21.

Comparison of Model Estimates

Model Fit

For the PROMIS^® Anxiety Short Form 6a, the graded response model produced desirable model fit results, as indicated by a non-significant chi-square value [X² (df) =13.81(9), p= 0.13], low RMSEA (0.089) and WRMR (0.481) values, and high CFI (0.997) and TLI (0.995) values (Table 2). For GAD-7, the graded response model also produced very good model fit, supported by a non-significant chi-square value [X² (df) =17.95 (14), p=0.21], low RMSEA (0.065) and WRMR (0.460) values, and high CFI (0.996) and TLI (0.994) values (Table 3).
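As a consistency check on the reported chi-square statistics, the p-values can be recomputed from the test statistic and degrees of freedom. The sketch below uses the closed-form series for the upper tail of a chi-square distribution with integer degrees of freedom; the function name `chi_square_p_value` is hypothetical, and the check treats the reported statistic as an ordinary chi-square variate, which is an assumption about how the software derived its p-value.

```python
import math

def chi_square_p_value(x, df):
    """Upper-tail probability of a chi-square statistic x with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in sqrt(x).
    root = math.sqrt(x)
    term = root
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * density * total

p_6a = chi_square_p_value(13.81, 9)     # PROMIS Anxiety Short Form 6a
p_gad7 = chi_square_p_value(17.95, 14)  # GAD-7
print(f"6a: p = {p_6a:.2f}; GAD-7: p = {p_gad7:.2f}")
```

Rounded to two decimals, both values agree with the p-values reported in Tables 2 and 3.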

Table 2.

IRT-based Model Parameters for the PROMIS^® Anxiety Short Form 6a

Items	Std. a	Std. δ^# Step 1	Std. δ^# Step 2	Std. δ^# Step 3
1. I felt fearful	0.68	−0.09	1.53	2.76
2. I found it hard to focus on anything other than my anxiety	0.90	0.07	1.09	2.41
3. My worries overwhelmed me	0.84	0.33	1.60
4. I felt uneasy	0.97	−0.37	0.78	1.94
5. I felt nervous	0.96	−0.46	0.69	2.26
6. I felt like I needed help for my anxiety	0.85	0.38	1.15	2.21

Note. N=67 participants. Std. a = standardized item discrimination. Std. δ = standardized location (difficulty) parameter. Cronbach alpha = 0.88. Chi-Square Test of Model Fit (degree of freedom) = 13.81(9), p= 0.13. RMSEA (Root Mean Square Error of Approximation) = 0.089. CFI (Comparative Fit Index) = 0.997. TLI (Tucker-Lewis Index) = 0.995. WRMR (Weighted Root Mean Square Residual) = 0.481

There are no item discrimination values for Step 4.

Table 3.

IRT-based Model Parameters for the Generalized Anxiety Disorder 7-Item Scale

Items	Std. a	Std. δ^# Step 1	Std. δ^# Step 2	Std. δ^# Step 3
1. Feeling nervous, anxious, or on edge	0.93	0.31	1.35	1.67
2. Not being able to stop or control worrying	0.93	0.48	1.55	1.83
3. Worrying too much about different things	0.91	0.36	1.58	2.07
4. Trouble relaxing	0.86	−0.07	1.00	1.68
5. Being so restless that it’s hard to sit still	0.84	0.73	1.50	1.85
6. Becoming easily annoyed or irritable	0.74	0.23	1.59	2.10
7. Feeling afraid as if something awful might happen	0.78	0.91	2.41

Note. N=67 participants. Std. a = standardized item discrimination. Std. δ = standardized location (difficulty) parameter. Cronbach alpha = 0.90. Chi-Square Test of Model Fit (degree of freedom) =17.95(14), p=0.21. RMSEA (Root Mean Square Error of Approximation) =0.065. CFI (Comparative Fit Index) =0.996. TLI (Tucker-Lewis Index) =0.994. WRMR (Weighted Root Mean Square Residual) =0.460.

Item Difficulties (Locations)

Standardized item locations are displayed for the PROMIS^® Anxiety Short Form 6a and GAD-7 (Table 2, Table 3, Online Supplementary Figure 1). The range of item locations along the continuum of the anxiety construct was wider in the 6a (−0.46 to 2.76) than that in the GAD-7 (−0.07 to 2.10). Item difficulty estimates in the 6a were distributed evenly along the continuum, indicating that the levels of difficulty represented by the response categories were distinct enough and that the items performed well in distinguishing participants at specific points on the latent anxiety trait. In the GAD-7, item difficulty estimates for step 2 and step 3 were located very closely, indicating the last two response categories may not be distinct enough to be separate categories.
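The closeness of the step 2 and step 3 locations can be quantified directly from Tables 2 and 3. The sketch below computes the distance between those two locations for each item; the helper `step_gaps` is hypothetical, and items with only two listed locations (6a item 3 and GAD-7 item 7) are left out because the tables do not make clear which step is missing.

```python
from statistics import mean

# Standardized item locations (step 1, step 2, step 3) copied from Tables 2 and 3.
# Items with fewer than three listed locations are omitted.
promis_6a = {
    1: (-0.09, 1.53, 2.76),
    2: (0.07, 1.09, 2.41),
    4: (-0.37, 0.78, 1.94),
    5: (-0.46, 0.69, 2.26),
    6: (0.38, 1.15, 2.21),
}
gad7 = {
    1: (0.31, 1.35, 1.67),
    2: (0.48, 1.55, 1.83),
    3: (0.36, 1.58, 2.07),
    4: (-0.07, 1.00, 1.68),
    5: (0.73, 1.50, 1.85),
    6: (0.23, 1.59, 2.10),
}

def step_gaps(locations):
    """Distance between the step 2 and step 3 locations of each item."""
    return [round(step3 - step2, 2) for _, step2, step3 in locations.values()]

gaps_6a = step_gaps(promis_6a)
gaps_gad7 = step_gaps(gad7)
print(f"6a:    gaps {gaps_6a}, mean {mean(gaps_6a):.2f}")
print(f"GAD-7: gaps {gaps_gad7}, mean {mean(gaps_gad7):.2f}")
```

For the items included, every step 2 to step 3 gap in the GAD-7 is smaller than the smallest such gap in the 6a, which is in line with the interpretation above.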

Item Discrimination

All items in the 6a (range: 0.68-0.97) and GAD-7 (range: 0.74-0.93) had positive discrimination parameters. This information indicated all items in the two measures had good ability in locating participants on the anxiety latent trait (Table 2, Table 3).

Item Characteristic Curves (ICCs)

The locations and shapes of ICCs varied depending on parameters of item difficulties and item discriminations. For 6a, the ICCs of all items demonstrated good discriminations of items and response options (Online Supplementary Figure 2). The curves for step 4 of all items were not precisely estimated due to lack of responses endorsing “always” for each item in the study sample. For the GAD-7, the third category of the first six items showed curves with very low peaks and was not the most likely response for any location across the latent trait (Online Supplementary Figure 3). This is probably due to the similar frequency of responses for the third response option (“more than half the days”) and the last response option (“nearly every day”) among participants. This information indicates the potential need for revision at the response option level, such as combining the last two response options (“more than half the days” and “nearly every day”) into one or making the third response option (“more than half the days”) more distinct from the second (“several days”) and fourth (“nearly every day”) response options.
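The pattern described for the third GAD-7 category can be reproduced in a small sketch: when two adjacent step locations lie close together, the category between them may never be the most likely response. The code assumes a logistic graded response model and a hypothetical discrimination value (`A_HYPOTHETICAL`), and it treats the standardized locations from Tables 2 and 3 as if they were on that metric, so it illustrates the mechanism rather than reproducing the study's curves.

```python
import math

def category_probs(theta, a, thresholds):
    """Category probabilities under a logistic graded response model."""
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def modal_categories(a, thresholds, grid):
    """Categories (0-indexed) that are the most likely response somewhere on the grid."""
    modal = set()
    for theta in grid:
        probs = category_probs(theta, a, thresholds)
        modal.add(probs.index(max(probs)))
    return modal

grid = [x / 20 for x in range(-80, 81)]  # theta from -4 to 4 in steps of 0.05
A_HYPOTHETICAL = 2.0                     # illustrative discrimination, not an estimate

# Step locations for item 1 of each measure, taken from Tables 2 and 3.
gad7_item1 = modal_categories(A_HYPOTHETICAL, [0.31, 1.35, 1.67], grid)
promis_item1 = modal_categories(A_HYPOTHETICAL, [-0.09, 1.53, 2.76], grid)
print("GAD-7 item 1 modal categories:", sorted(gad7_item1))
print("6a item 1 modal categories:", sorted(promis_item1))
```

Here category index 2 corresponds to the third response option. Under these assumptions it is never the most likely response for the GAD-7 item, whereas each category of the 6a item is the most likely response over some range of the latent trait.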

Item Information Curves

In 6a, information for item 1 (“I felt fearful”) was lowest, providing information across a relatively limited range of the latent trait. Information for item 2 (“I found it hard to focus on anything other than my anxiety”), item 3 (“my worries overwhelmed me”), and item 6 (“I felt like I needed help for my anxiety”) was moderate, and information for item 4 (“I felt uneasy”) and item 5 (“I felt nervous”) was highest, indicating different amounts of information were contributed by each item (Online Supplementary Figure 4).

In the GAD-7, information for item 6 (“becoming easily annoyed or irritable”) and item 7 (“feeling afraid as if something awful might happen”) was lowest. Information for item 3 (“worrying too much about different things”), item 4 (“trouble relaxing”), and item 5 (“being so restless that it’s hard to sit still”) was moderate. Information for item 1 (“feeling nervous, anxious, or on edge”) and item 2 (“not being able to stop or control worrying”) was highest (Online Supplementary Figure 4).

Test Information Curve

As a whole measure, the PROMIS^® Anxiety Short Form 6a (approximately a value of 12 on the amount of information) contributed comparatively more information than the GAD-7 (approximately a value of 10 on the amount of information) in determining the level of anxiety (Online Supplementary Figure 5). For both the 6a and GAD-7, the test information curves were unimodal and close to symmetric around 0.5 and provided the maximum amount of information between −0.5 and 1.5 on the latent trait. This information indicates items in the two measures performed better in participants with average to moderate levels of anxiety, compared to those participants with very low or high levels of anxiety in this study. In addition, information at the high end of the curves in both measures was not estimated, probably due to the low-to-moderate levels of anxiety in the sample and very few participants endorsing high levels of anxiety.

Discussion

Prior research has used IRT modeling to examine PROMIS^® Anxiety Short Forms in ethnically diverse subgroups (Teresi et al., 2016) and the GAD-7 in primary care patients (Jordan et al., 2017) separately, rather than in the same patient population. This study is the first comparing psychometric properties of the PROMIS^® Anxiety Short Form 6a and GAD-7 using two-parameter IRT models in the same veteran sample following orthopedic surgery. Notably, while the PROMIS^® Anxiety Short Form 6a is designed to assess anxiety during the past seven days and the GAD-7 is designed to assess anxiety during the past two weeks, the level of anxiety captured by each measure was similar in the study sample.

Findings of this study show that the psychometric performance of the 6a and GAD-7 is comparably desirable based on comparison of four IRT-based estimates: 1) model fit: data from both measures have good fit to the models; 2) item discrimination: all items in both 6a and GAD-7 had good discrimination; 3) item information function: the curves for both measures seemed appropriate and different items represented different levels of information in determining anxiety levels; and 4) test information function: both measures had unimodal curves and performed the best for individuals with average-to-moderate levels of anxiety. These findings were consistent with prior reports that PROMIS^® Anxiety Short Forms and the GAD-7 show unidimensionality and decent fit to IRT models in different adult patients (Jordan et al., 2017; Teresi et al., 2016).

In addition, this study provided preliminary evidence that the 6a may be preferable to the GAD-7 in distinguishing veterans following orthopedic surgery on anxiety, especially those with low-to-moderate levels of anxiety. The preliminary evidence is based on comparison of two IRT-based parameters: 1) item difficulties: the estimates in the 6a had a wider range and were distributed more evenly along the anxiety continuum than those in GAD-7; and 2) ICCs: all response categories seem useful in the 6a, while the third categories of most items in the GAD-7 were not useful in discriminating individuals with different levels of anxiety.

This study has some implications for future research. Findings of this study point out an interesting signal for future examination and accumulation of evidence on the use of the PROMIS^® Anxiety Short Form 6a as potentially preferable to the GAD-7 to measure anxiety in Veterans following orthopedic surgery, especially those with low-to-moderate levels of anxiety. In this study, while the GAD-7 shows good estimates for multiple psychometric properties (i.e., internal consistency, concurrent validity, model fit, item discrimination, and item and test information function), it demonstrates some potential problems related to the usefulness of the third response categories, indicating the GAD-7 as a whole scale may not have adequate discrimination for individuals with low-to-moderate levels of anxiety. One possible reason is that the GAD-7 was developed based on CTT, which focuses primarily on the performance of the whole scale rather than the performance of individual items. Comparatively, the PROMIS^® Anxiety Short Form 6a has good model fit and reliability as a whole scale, as well as items with good discrimination, evenly distributed item difficulties, and desirable item characteristic curves (ICCs), making it a potentially preferable tool for assessing anxiety in veterans following orthopedic surgery. Therefore, the 6a may be recommended for use to measure generalized anxiety in future research. While further examination and comparison of the two measures using larger diverse samples is needed, this recommendation has the added benefit of providing a common data element that can be combined with data from other studies through the NIH repository, resulting in a large dataset for further analyses to address research inquiries related to the PROMIS^® Anxiety Short Form 6a and anxiety.

Findings also provided preliminary indicators for potential revision for the scales at the item and response option levels. Some response options of certain items in GAD-7 were not the most likely response to be endorsed by participants for any location across the latent trait, indicating the need of deletion or combination of response options. For example, findings indicated the consideration of combining “more than half the days” and “nearly every day” for the GAD-7 items. Another consideration might be revising “more than half the days” so it is more distinct from “several days” and “nearly every day”. This may be related to the use of a small study sample with lower levels of anxiety resulting in limited responses endorsing certain options. Future testing is needed in larger, diverse samples to guide decisions on the revisions of items and response options.

This study has some implications for clinical practice. While the GAD-7 has been most commonly used by clinicians as the gold standard to screen and identify GAD, this study provides preliminary psychometric evidence to support the use of the PROMIS^® Anxiety Short Form 6a as a preferred alternative. Due to problematic characteristic curves of certain response categories, the GAD-7 has potential issues in determining anxiety levels in a consistent way, a potential measurement bias that may lead to error in screening and diagnosing GAD. While prior research showed that the 27-item set, combining the GAD-7 and PROMIS^® Anxiety item bank, had evidence of unidimensionality and good model fit to assess anxiety (Schalet et al., 2014), the large number of items can easily result in response burden. Comparatively, the PROMIS^® Anxiety Short Form 6a has very good discrimination to distinguish veterans consistently across the latent trait of anxiety. In addition, the 6a is simple and easy to incorporate into clinical assessment and does not add patient burden in providing responses.

This study has some limitations. The use of a small sample of U.S. Veterans following orthopedic surgery with low-to-moderate self-reported anxiety may influence model estimates. The low-to-moderate anxiety in the sample may be related to the facts that 1) anxiety was assessed at 3 months after surgery when participants’ anxiety could be decreased due to completion of and recovery from surgery, and 2) half of the sample were assigned to the intervention group and received Acceptance and Commitment Therapy, which has been shown effective in reducing anxiety (Swain et al., 2013). Therefore, the findings may only be generalized to U.S. Veterans following orthopedic surgery with low-to-moderate self-reported anxiety.

This study points out two directions for future research in testing and comparing psychometric properties of the PROMIS^® Anxiety Short Form 6a and GAD-7 among U.S. Veterans. First, future testing is needed among larger diverse samples, especially those with moderate-to-high levels of anxiety, including veterans prior to, right after, and/or less than 3 months after surgery, as well as veterans who did not receive treatment for anxiety. Second, future research may assess anxiety using different PROMIS^® Anxiety Short Forms (i.e., 8a, 6a, 7a, and 4a) separately to compare psychometric performance of the PROMIS^® Anxiety Short Forms and GAD-7.

In conclusion, while 6a and GAD-7 are comparable in most IRT-based estimates, this study provided preliminary evidence on the use of the 6a as preferable to the GAD-7 to measure anxiety in veterans following orthopedic surgery. Future testing is needed through collecting data on larger, diverse samples with higher levels of anxiety.

Supplementary Material

Online_Supplementary_tables_&_figures

NIHMS1702072-supplement-Online_Supplementary_tables___figures.docx (704KB, docx)

Acknowledgement:

Data used in this study were collected from a pilot RCT (NIH/NINR R34 AT008349-01, Preventing Persistent Post-Surgical Pain & Opioid Use in Veterans: Effect of ACT, PI: Rakel; ClinicalTrials.gov Identifier: NCT02437188). The sponsor was not involved in study design, data collection/analysis, interpretation of findings, or manuscript preparation.

Funding support:

None.

Footnotes

Conflicts of interest: We have no conflict of interest to declare.

Submission declaration and verification: The work described has not been published previously and is not under consideration for publication elsewhere. Its publication is approved by all authors and, tacitly or explicitly, by the responsible authorities where the work was carried out. If accepted, it will not be published elsewhere in the same form, in English or in any other language, including electronically, without the written consent of the copyright holder.

References

Allen MJ, & Yen WM (2001). Introduction to measurement theory. Waveland Press.
Beleckas CM, Prather H, Guattery J, Wright M, Kelly M, & Calfee RP (2018). Anxiety in the orthopedic patient: Using PROMIS to assess mental health. Quality of Life Research, 27(9), 2275–2282. 10.1007/s11136-018-1867-7
Byrne BM (2012). Structural equation modeling with Mplus: Basic concepts, applications, and programming. Routledge.
de Ayala RJ (2009). The theory and practice of item response theory. The Guilford Press.
DeVellis RF (2016). Scale development: Theory and applications. SAGE.
Dindo L, Zimmerman MB, Hadlandsmyth K, St. Marie B, Embree J, Marchman J, Tripp-Reimer T, & Rakel B (2018). Acceptance and commitment therapy for prevention of chronic postsurgical pain and opioid use in at-risk veterans: A pilot randomized controlled study. The Journal of Pain, 19(10), 1211–1221. 10.1016/j.jpain.2018.04.016
Driban JB, Morgan N, Price LL, Cook KF, & Wang C (2015). Patient-Reported Outcomes Measurement Information System (PROMIS) instruments among individuals with symptomatic knee osteoarthritis: A cross-sectional study of floor/ceiling effects and construct validity. BMC Musculoskeletal Disorders, 16(1), 1–9. 10.1186/s12891-015-0715-y
Hadlandsmyth K, Dindo LN, St. Marie BJ, Wajid R, Embree JL, Noiseux NO, Tripp-Reimer T, Zimmerman MB, & Rakel BA (2020). Patient-Reported Outcomes Measurement Information System (PROMIS) instruments: Reliability and validity in veterans following orthopedic surgery. Evaluation & the Health Professions, 43(4), 207–212. 10.1177/0163278719856406
Hamilton M (1959). The assessment of anxiety states by rating. British Journal of Medical Psychology, 32(1), 50–55. 10.1111/j.2044-8341.1959.tb00467.x
Hamilton M (1986). The Hamilton rating scale for depression. Springer.
Hu L, & Bentler PM (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. 10.1080/10705519909540118
Jordan P, Shedden-Mora MC, & Löwe B (2017). Psychometric analysis of the Generalized Anxiety Disorder scale (GAD-7) in primary care using modern item response theory. PLoS ONE, 12(8), e0182162. 10.1371/journal.pone.0182162
Kline RB (2011). Principles and practice of structural equation modeling. Guilford Press.
Lee AC, Driban JB, Price LL, Harvey WF, Rodday AM, & Wang C (2017). Responsiveness and minimally important differences for 4 patient-reported outcomes measurement information system short forms: Physical function, pain interference, depression, and anxiety in knee osteoarthritis. The Journal of Pain, 18(9), 1096–1110. 10.1016/j.jpain.2017.05.001
Lewis G, Rice D, McNair P, & Kluger M (2015). Predictors of persistent pain after total knee arthroplasty: A systematic review and meta-analysis. British Journal of Anaesthesia, 114(4), 551–561. 10.1093/bja/aeu441
Marrie RA, Zhang L, Lix LM, Graff LA, Walker JR, Fisk JD, Patten SB, Hitchon CA, Bolton JM, & Sareen J (2018). The validity and reliability of screening measures for depression and anxiety disorders in multiple sclerosis. Multiple Sclerosis and Related Disorders, 20, 9–15. 10.1016/j.msard.2017.12.007
Muthén LK, & Muthén BO (2010). Mplus user’s guide. Los Angeles, CA: Author.
National Institutes of Health. (2019). Patient-Reported Outcomes Measurement Information System (PROMIS): Dynamic tools to measure health outcomes from the patient perspective: A brief guide to the PROMIS® Anxiety Instrument. http://www.healthmeasures.net/images/PROMIS/manuals/PROMIS_Anxiety_Scoring_Manual.pdf
Noiseux NO, Callaghan JJ, Clark CR, Zimmerman MB, Sluka KA, & Rakel BA (2014). Preoperative predictors of pain following total knee arthroplasty. The Journal of Arthroplasty, 29(7), 1383–1387. 10.1016/j.arth.2014.01.034
Purvis TE, Neuman BJ, Riley LH III, & Skolasky RL (2018). Discriminant ability, concurrent validity, and responsiveness of PROMIS health domains among patients with lumbar degenerative disease undergoing decompression with or without arthrodesis. Spine, 43(21), 1512–1520. 10.1097/BRS.0000000000002661
Purvis TE, Neuman BJ, Riley LH, & Skolasky RL (2019). Comparison of PROMIS Anxiety and Depression, PHQ-8, and GAD-7 to screen for anxiety and depression among patients presenting for spine surgery. Journal of Neurosurgery: Spine, 30(4), 524–531. 10.3171/2018.9.SPINE18521
Rasch G (1960). Probabilistic models for some intelligence and attainment tests. University of Chicago Press.
Rutter LA, & Brown TA (2017). Psychometric properties of the generalized anxiety disorder scale-7 (GAD-7) in outpatients with anxiety and mood disorders. Journal of Psychopathology and Behavioral Assessment, 39(1), 140–146. 10.1007/s10862-016-9571-9
Schalet BD, Cook KF, Choi SW, & Cella D (2014). Establishing a common metric for self-reported anxiety: Linking the MASQ, PANAS, and GAD-7 to PROMIS Anxiety. Journal of Anxiety Disorders, 28(1), 88–96. 10.1016/j.janxdis.2013.11.006
Spitzer RL, Kroenke K, Williams JB, & Löwe B (2006). A brief measure for assessing generalized anxiety disorder: The GAD-7. Archives of Internal Medicine, 166(10), 1092–1097. 10.1001/archinte.166.10.1092
Sunderland M, Batterham P, Calear A, & Carragher N (2018). Validity of the PROMIS depression and anxiety common metrics in an online sample of Australian adults. Quality of Life Research, 27(9), 2453–2458. 10.1007/s11136-018-1905-5
Swain J, Hancock K, Hainsworth C, & Bowman J (2013). Acceptance and commitment therapy in the treatment of anxiety: A systematic review. Clinical Psychology Review, 33(8), 965–978. 10.1016/j.cpr.2013.07.002
Teresi JA, Ocepek-Welikson K, Kleinman M, Ramirez M, & Kim G (2016). Measurement equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Anxiety short forms in ethnically diverse groups. Psychological Test and Assessment Modeling, 58(1), 183–219.
Toussaint A, Hüsing P, Gumz A, Wingenfeld K, Härter M, Schramm E, & Löwe B (2020). Sensitivity to change and minimal clinically important difference of the 7-item Generalized Anxiety Disorder Questionnaire (GAD-7). Journal of Affective Disorders, 265, 395–401. 10.1016/j.jad.2020.01.032
Tully PJ, & Baker RA (2012). Depression, anxiety, and cardiac morbidity outcomes after coronary artery bypass surgery: A contemporary and practical review. Journal of Geriatric Cardiology, 9(2), 197–208. 10.3724/SP.J.1263.2011.12221
Tuma AH, & Maser JD (2019). Anxiety and the anxiety disorders. Routledge.
Waltz CF, Strickland OL, & Lenz ER (2010). Measurement in nursing and health research. Springer.
