Abstract
Background
Two item banks for substance use were developed as part of the Patient-Reported Outcomes Measurement Information System (PROMIS®): severity of substance use and positive appeal of substance use.
Methods
Qualitative item analysis (including focus groups, cognitive interviewing, expert review, and item revision) reduced an initial pool of more than 5,300 items for substance use to 119 items included in field testing. Items were written in a first-person, past-tense format, with 5 response options reflecting frequency or severity. Both 30-day and 3-month time frames were tested. The calibration sample of 1,336 respondents included 875 individuals from the general population (ascertained through an internet panel) and 461 patients from addiction treatment centers participating in the National Drug Abuse Treatment Clinical Trials Network.
Results
Final banks of 37 and 18 items were calibrated for severity of substance use and positive appeal of substance use, respectively, using the two-parameter graded response model from item response theory (IRT). Initial calibrations were similar for the 30-day and 3-month time frames, and final calibrations used data combined across the time frames, making the items applicable with either interval. Seven-item static short forms were also developed from each item bank.
Conclusions
Test information curves showed that the PROMIS item banks provided substantial information in a broad range of severity, making them suitable for treatment, observational, and epidemiological research in both clinical and community settings.
Keywords: substance use, drug use, item response theory, measurement
1. Introduction
The Patient-Reported Outcomes Measurement Information System (PROMIS®) is designed to improve assessment of self-reported outcomes using state-of-the-art psychometric methods (see www.nihpromis.org). PROMIS item banks provide a comprehensive profile of health status, including physical functioning, pain, fatigue, sleep disturbance, emotional distress, alcohol use, and social participation (Buysse et al., 2010; Cella et al., 2010; Fries et al., 2009; Pilkonis et al., 2011, 2013; Revicki et al., 2009). PROMIS is the most ambitious attempt to date to apply models from item response theory (IRT) to health-related assessment (Cella et al., 2010; Hilton, 2011; Reeve et al., 2007). We report here on the development and calibration of two item banks measuring severity of substance use and the positive appeal of substance use, an important motivational factor influencing the development and treatment of substance use disorder (Cox et al., 2015; Dow and Kelly, 2013).
In the area of substance use, analyses relying on IRT have been used for multiple purposes: investigation of the dimensionality and informational value of diagnostic criteria for substance use disorders (Langenbucher et al., 2004; Lynskey and Agrawal, 2007; Saha et al., 2012); refinement of existing instruments (e.g., the Alcohol, Smoking and Substance Involvement Screening Test, ASSIST; the Global Appraisal of Individual Needs, GAIN), often to shorten them for purposes of more efficient screening (Ali et al., 2013; Stucky et al., 2014); and development of scales to assess constructs relevant to substance use, e.g., risk behavior related to injecting drugs, alexithymia (Hendryx et al., 1992; Janulis, 2014). The DSM-5 Substance-Related Disorders Work Group reviewed 39 IRT studies on the scaling and calibration of criteria for substance abuse and dependence. On the basis of this evidence, the Work Group eliminated the distinction between abuse and dependence in favor of a single category—substance use disorder—operationalized with 11 criteria (Hasin et al., 2013). The diagnostic criteria are sufficiently unidimensional for calibration with IRT models, but they are high-threshold items most relevant at the severe end of the continuum of substance use (Hartman et al., 2008; Langenbucher et al., 2004; Lynskey and Agrawal, 2007). Our goal was to identify not only items consistent with the diagnostic criteria but also items that were more normally distributed in a sample that included drug users in the community as well as drug users seeking treatment. Such items will provide more information across a broader range of the continuum of substance use. For this reason, they should constitute more sensitive measures of treatment outcome and result in a single metric of severity that can be used across treatment, observational, and epidemiological studies.
Drugs of abuse are too various to encourage the development of separate item banks for individual substances, especially in the PROMIS context where the mandate is to develop item banks for generic constructs relevant to all chronic diseases. Therefore, we took a broad approach and created items that referred to “drugs” in general. This approach is also consistent with the clinical epidemiology of many secondary and tertiary care facilities where patients appearing for treatment often abuse multiple substances (Stinson et al., 2005).
Substance use may vary over time as a function of availability of different drugs, resources for procuring drugs, periods of incarceration, and other life circumstances. Thus, an important issue is the preferred time frame for items. A shorter time frame (e.g., the past 30 days) is preferable while patients are in treatment and during the early stages of recovery when they are at high risk of relapse. Thus, a shorter period would be most suitable for clinical trials, treatment outcome studies, and comparative effectiveness studies where the priorities are repeated measurements and responsiveness to change over relatively brief periods. By contrast, a longer time frame (e.g., the past 3 months) allows aggregation of use over a more extended interval and may be more appropriate for observational or epidemiological studies. Rather than making an a priori decision about time frame, we chose to investigate empirically the psychometric implications of 30-day and 3-month time frames. Thus, the same item pool was administered to two different samples receiving one or the other of the two time frames, allowing examination of the performance of items under the two conditions.
2. Methods
2.1. Development of item pool
2.1.1. Comprehensive literature searches
We performed searches in the MEDLINE, PsycINFO, and Health and Psychosocial Instruments (HaPI) databases. Details of the methodology are reported in Klem et al. (2009). The searches generated 3,388 abstracts linked to more than 80 unique measures of substance use. Copies of the measures were gathered from both electronic and print sources, and measures were reviewed at the item level.
2.1.2. Conceptual organization of items
The initial substance use item pool contained over 5,300 items. We organized the items into conceptually meaningful categories using a hierarchical approach informed by previous empirical work (e.g., factor analyses) and clinical formulation. Prior work had divided substance use items into domains relevant both to the DSM categorization of substance use disorders, e.g., syndromal indicators of dependence, negative consequences associated with substance use (Green et al., 2011; Krueger et al., 2004; Muthén, 2006; Saha et al., 2007), and to broader themes, e.g., motivations for drug use, drug-seeking behaviors (Jones et al., 2001; Korcha et al., 2011; Pabst et al., 2009). Our hierarchical structure for substance use included five domains: motivations for use, intended effects of use, drug-seeking behaviors, patterns of consumption, and negative consequences of use. We also created 38 distinct facets across the 5 domains. For example, within the domain of negative consequences, we included separate facets for physical, mental, social, and legal consequences.
2.1.3. Focus groups
To ensure comprehensive coverage of the substance use domain from the patient perspective, we conducted two focus groups and performed thematic analyses of the topics discussed (Castel et al., 2008; Kelly et al., 2011). Members of the groups (n = 15) were recruited from outpatient substance use treatment programs. The average age was 41 (SD = 9). Females (53%) and members of minority groups were well-represented (race = African American 53%, ethnicity = Hispanic 6%). Two-thirds of the participants (67%) had no formal education beyond high school.
Using semi-structured scripts, facilitators prompted participants to discuss their experiences with substance use and the characteristics of problematic consumption. Research staff reviewed process notes from the groups and audio recordings, paying close attention to appraisals of use (consequences of substance use, intended versus actual effects of use) and contexts and triggers of substance-related experiences. Participants discussed a variety of consequences of substance use: physical (e.g., pain, contracting sexually transmitted diseases), mental (e.g., decreased concentration, emotional distress), social (e.g., loss of friends and relationships), and financial (e.g., job loss, debt). In an era when legalization of marijuana for medical and recreational purposes is becoming more common, we were also attentive to perceived positive aspects of substance use. The appeal of substance use included both increasing positive emotions (e.g., feeling happy and social) and alleviating negative emotions (e.g., reducing depression and anxiety). The general goal was to enrich our item pool with content not always represented on traditional questionnaires. We were also eager to identify potential items with lower thresholds for endorsement (e.g., used only on weekends, used only in certain situations).
2.1.4. Cognitive interviews
Forty-three participants from outpatient substance use treatment programs were recruited for cognitive interviews. All items were reviewed by at least six individuals. An effort was made to include adequate numbers of women and members of minority groups (47% and 53%, respectively). Forty percent reported no formal education beyond high school. An interviewer met with participants and asked each to “think aloud” while responding to items. Participants were then prompted for feedback on the language and clarity of items and the relevance of the content. Of the 338 items reviewed by participants, 72% were acceptable as presented, 15% were re-written, and 14% were deleted. Of the re-written items, a majority (82%) required a revision to the item stem (content) versus the time frame or response options.
The cognitive interviews and continued revision of the items by members of the research team produced a pool of 119 items for general substance use. More details of the qualitative review process and its results are described in the supplement available online.
2.2. Sampling
We administered the items to an internet sample of 875 participants from the general population and a clinical sample of 461 patients in treatment for substance use disorders in North Carolina, Oregon, and Pennsylvania. The treatment centers participated in the National Drug Abuse Treatment Clinical Trials Network (McCarty et al., 2008; Tai et al., 2010). The internet sample was collected through YouGov, a web-based polling firm in Palo Alto, CA. Eligible participants reported use of an illicit substance within either the past 30 days or the past 3 months in initial screening. This procedure resulted in two calibration samples: substance use in the past 30 days (n = 765; internet = 459, clinical = 306), and substance use in the past 3 months (n = 571; internet = 416, clinical = 155).
Demographic characteristics of the YouGov sample and the clinical sample (combined across all sites) are summarized in Table 1. Members of the clinical sample were younger (p < .001) and less well educated (p < .001) than participants from the community. There were racial differences between the two samples (p < .001), due largely to more African American participants and fewer white participants in the clinical sample. However, there were fewer Hispanic participants in the clinical sample (p < .001), and there was no difference in sex ratio.
Table 1.
Demographic Characteristics of the Calibration Sample
| Characteristic | Clinical Sample (n = 461), % | Community Sample (n = 875), % |
|---|---|---|
| Sex | | |
| Male | 50.8 | 49.6 |
| Ethnicity | | |
| Hispanic | 5.4 | 13.1 |
| Race | | |
| American Indian/Alaska Native | 2.2 | 0.0 |
| Asian | 0.9 | 1.3 |
| Black/African American | 22.3 | 10.3 |
| White | 68.8 | 82.4 |
| Other/Multiracial | 5.9 | 6.1 |
| Education | | |
| High school diploma or less | 53.4 | 41.0 |
| Further educational attainment | 46.6 | 59.0 |
| Mean Age (SD) | 37.3 (11.4) | 44.6 (16.5) |
2.3. Measures
The substance use item pool used in field testing contained 119 items assessing motivations for use (17 items), intended effects of use (30 items), drug-seeking behaviors (9 items), patterns of consumption (25 items), and negative consequences of use (38 items). Participants also completed the Alcohol, Smoking and Substance Involvement Screening Test (ASSIST v3.0; WHO ASSIST Working Group, 2002), a commonly used “legacy” instrument. In addition, they reported the ages at which they first used alcohol and drugs, and they provided a medical history that asked about lifetime diagnoses and functional impairment in nine areas (e.g., cardiovascular problems, cancer, mental health).
2.4. Data analysis
2.4.1. General strategy
Our major goal was to identify the most robust latent constructs within the item pool and to document sufficient unidimensionality for each of them to proceed with IRT analyses. First, we inspected frequency distributions of individual items for sparse cells. We then investigated dimensionality by dividing the samples randomly into two subsamples, one for exploratory factor analysis (unweighted least squares EFA, n = 276 for the 3-month time frame and n = 413 for the 30-day time frame) and the second for confirmatory factor analysis (CFA, n = 295 for the 3-month time frame and n = 352 for the 30-day time frame). The EFAs and CFAs were conducted using Mplus 6.1 with promax rotation (Muthén and Muthén, 2010). In the CFAs, the items were treated as categorical variables, and the robust weighted least squares estimator was used.
2.4.2. Item response theory analysis
The 2-parameter graded response model (GRM; Samejima, 1969) is the most commonly used IRT model for polytomous items. The GRM has a slope parameter and n − 1 threshold parameters for each item, where n is the number of response categories (5 in the present analyses). The slope parameter measures item discrimination, i.e., how well the item differentiates between higher versus lower levels of severity of substance use. Informative items have large slope parameters. Threshold parameters measure item difficulty, i.e., the ease versus difficulty of endorsing different response options for an item. Items were calibrated using IRTPRO v2.1 (Cai et al., 2011).
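As an illustration of how the slope and threshold parameters determine responses, the following minimal sketch computes GRM category probabilities for one item. It assumes the logistic form of the model with slopes on the logistic metric (no 1.7 scaling constant); the function name `grm_category_probs` is hypothetical, and the parameters are taken from the first item in Table 4.

```python
import math

def grm_category_probs(theta, slope, thresholds):
    """Return P(X = k | theta) for k = 0..len(thresholds) under the graded response model."""
    # Cumulative probabilities P(X >= k) for k = 1..m (logistic form, assumed metric)
    cum = [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
    # Pad with P(X >= 0) = 1 and P(X >= m + 1) = 0, then difference adjacent curves
    bounds = [1.0] + cum + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

# "I felt that my drug use was out of control": slope and 4 thresholds from Table 4
probs = grm_category_probs(theta=1.0, slope=6.56, thresholds=[0.47, 0.71, 0.93, 1.11])
print([round(p, 3) for p in probs])
```

With 5 response categories the function returns 5 probabilities that sum to 1. A larger slope makes the transition between adjacent categories sharper around each threshold.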
Differential item functioning (DIF) occurs when characteristics (e.g., age, gender, or ethnicity) which may seem extraneous to the assessment of the constructs under consideration affect measurement. An item is identified as functioning differentially if the item is more (or less) difficult to endorse or more (or less) discriminating in one group (e.g., men) compared to a reference group (e.g., women) when the different subgroups have been matched on the latent trait under investigation. With regard to demographic characteristics, we conducted DIF analyses (for both uniform and non-uniform DIF) on the basis of age, gender, and education (high school education or less versus further educational attainment). We focused on these variables because the relevant comparison groups were adequately represented. We also used a median split on age (less than 43 versus 43 or older) to compare younger and older respondents. Other potential comparison groups (e.g., white versus non-white respondents, Hispanic versus non-Hispanic respondents) were less equally divided. In addition, we conducted DIF analyses on the basis of severity of overall substance use (using a median split on the total substance involvement scores computed from the ASSIST). This analysis examined the performance of items based on differential exposure to drugs and documented their consistency across the full spectrum of substance use. Two different DIF procedures were employed—the IRT likelihood ratio method (Thissen et al., 1993) embedded in IRTPRO and an ordinal logistic regression procedure (Zumbo, 1999). Items were considered for removal if they showed significant DIF (p < .01) by both methods (Teresi et al., 2009).
3. Results
3.1. Drug, alcohol, and tobacco use in the two samples
Table 2 summarizes the lifetime use of substances reported on the ASSIST. The most frequently used drugs in the clinical sample were cannabis (94%), cocaine (83%), and opioids (75%), and in the community sample, cannabis (77%), cocaine (38%), and hallucinogens (38%). For the past three months, the clinical sample reported using opioids (66%), cannabis (64%), and cocaine (40%) most frequently; in the community sample, the comparable percentages were cannabis (58%), sedatives (19%), and opioids (10%). The samples also differed in median age of first substance use (clinical sample = 14, community sample = 17) and the percentage using prior to age 15 (clinical sample = 50%, community sample = 26%), a commonly cited risk factor (Grant and Dawson, 1998; King and Chassin, 2007; Swendsen et al., 2009). The lifetime prevalence of injecting any drug was 39% in the clinical sample and 11% in the community sample (see Table 3).
Table 2.
Substance Use Reported on the ASSIST
| Substance | Ever Used: Clinical Sample (n = 461), % | Ever Used: Community Sample (n = 875), % | Used in Past 3 Months: Clinical Sample (n = 461), % | Used in Past 3 Months: Community Sample (n = 875), % |
|---|---|---|---|---|
| Cannabis | 93.5 | 76.5 | 64.2 | 57.9 |
| Cocaine | 83.1 | 38.1 | 39.8 | 7.1 |
| Opioids | 75.2 | 23.7 | 65.7 | 10.0 |
| Amphetamines | 58.8 | 36.2 | 30.6 | 8.3 |
| Hallucinogens | 50.7 | 38.1 | 4.6 | 4.5 |
| Sedatives | 50.1 | 35.0 | 31.7 | 19.0 |
| Inhalants | 14.6 | 11.0 | 1.5 | 2.6 |
| Tobacco | 92.0 | 75.1 | 86.3 | 45.4 |
| Alcohol | 94.1 | 95.6 | 72.2 | 85.4 |
Table 3.
High-Risk Behaviors and Medical Status
| Item | Clinical Sample (n = 461), % | Community Sample (n = 875), % |
|---|---|---|
| In the past 3 months, while using drugs…ᵃ | | |
| Did risky things | 62.0 | 7.6 |
| Had unprotected sex | 28.4 | 6.0 |
| Used sex to get drugs | 21.5 | 4.4 |
| Taken advantage of sexually | 9.8 | 2.4 |
| Lifetime | | |
| Injected a substance | 39.0 | 10.6 |
| Used someone else’s needle | 15.0 | 5.6 |
| HIV/AIDS | 1.1 | 2.7 |
| Sexually transmitted disease | 25.8 | 15.2 |
| Liver problems | 23.4 | 7.1 |
ᵃ The response options for these items were “never,” “rarely,” “sometimes,” “often,” or “almost always.” The percentages reflect endorsements at the level of “sometimes” or higher.
Alcohol use in the past three months was common in both samples: 72% of the clinical sample and 85% of the community sample. Finally, 86% of the clinical sample were current smokers versus 45% for the community sample. High-risk behaviors and medical consequences associated with such behaviors are summarized in Table 3. In summary, the levels and patterns of substance use differed in the two samples (as expected), enhancing the generalizability of the item banks across clinical and community populations.
3.2. Frequency distributions
Within the initial pool of 119 items, there were no items with response categories having less than 1% response. There were 7 items (6%) having at least one response category with between 1% and 3% response. However, the sparse cells for all 7 items had at least 18 respondents, ranging from 18 to 39. Therefore, we retained all 5 response categories for all items for further analyses.
3.3. Factor analyses
3.3.1. Exploratory factor analyses
The initial EFA of the 119 items with a 3-month time frame (n = 276) yielded 10 factors with eigenvalues greater than 1, and the initial EFA with a 30-day time frame (n = 413) yielded 11 factors with eigenvalues greater than 1. In both analyses, however, the first factor (with eigenvalues of 76.7 for 3 months and 78.0 for 30 days) dominated, with ratios of the first to second eigenvalue of 9.92 (3 months) and 13.1 (30 days). In all analyses, this first factor included a large subset of items with high factor loadings reflecting many aspects of substance use, and it appeared to be a general indicator of the severity of use.
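The reported eigenvalues and ratios can be unpacked with simple arithmetic. The sketch below assumes that the eigenvalues were taken from the unreduced item correlation matrix, whose eigenvalues sum to the number of items (119); under that assumption it recovers the implied second eigenvalue and the share of total variance attributable to the first factor. The helper name `eigen_summary` is hypothetical.

```python
def eigen_summary(first, ratio, n_items):
    """Implied second eigenvalue and first-factor share of total variance."""
    second = first / ratio   # ratio is defined as first / second
    share = first / n_items  # assumes eigenvalues sum to n_items (correlation matrix)
    return second, share

# First eigenvalues and first-to-second ratios reported for the initial EFAs
for label, first, ratio in [("3-month", 76.7, 9.92), ("30-day", 78.0, 13.1)]:
    second, share = eigen_summary(first, ratio, n_items=119)
    print(f"{label}: second eigenvalue ~ {second:.2f}, first-factor share ~ {share:.1%}")
```

Under the stated assumption, the first factor accounts for roughly two-thirds of the total variance in both time frames, which is consistent with the description of a dominant general factor.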
Inspection of 1- through 5-factor solutions for the 3-month data supported a 2-factor solution as most interpretable, with items on the second factor reflecting recreational use of drugs, pleasurable aspects of the experience, and perceived positive consequences of drug use. Additional factors were less interpretable and appeared to represent “splinters” (i.e., small subsets of items with modest loadings on a primary factor and often with cross-loadings on other factors). Inspection of 1- through 5-factor solutions for the 30-day data supported a 3-factor solution as most interpretable, with the second and third factors including the items regarding positive aspects of drug use that loaded significantly on factor 2 with the 3-month data. Taken together, the EFAs documented the importance of 2 subsets of items: 95 items best understood as measuring the severity of substance use, and 22 items reflecting the positive appeal of substance use.
A second round of EFAs examined the two subsets of items separately in both the 3-month and 30-day data. Despite the large number of items in the severity subset, a 1-factor solution was best supported with both time frames. The eigenvalue for the first factor was 68.5 in the 3-month data and 69.3 in the 30-day data, producing ratios of the first to second eigenvalue of 21.2 in the 3-month data and 26.7 in the 30-day data. We retained the entire pool of 95 items for CFA and IRT calibration, all of which had factor loadings greater than .60 in the repeat EFAs. The repeat EFAs with the subset of 22 items reflecting positive appeal of substance use also supported a 1-factor solution. In this case, however, we chose to delete 2 items with smaller factor loadings (less than .50).
3.3.2. Confirmatory factor analyses
We performed single-factor CFAs on the reduced item pools, using the second half of the samples (3-month time frame, n = 295; 30-day time frame, n = 352). For the pool of 95 general severity items, there was excellent evidence of unidimensionality. For the 3-month data, all factor loadings were greater than .62, and several fit indices were uniformly strong: CFI = .982, TLI = .982, and RMSEA = .042. For the 30-day data, the results were similar: all factor loadings greater than .55, CFI = .981, TLI = .980, and RMSEA = .044. The smaller pool of 20 items reflecting positive appeal of substance use was less homogeneous in CFA terms (reflected primarily in a larger RMSEA), but the results were adequate to document that responses to these items were largely a function of a single underlying latent construct. For the 3-month data, all factor loadings were greater than .47, and the fit indices were CFI = .938, TLI = .928, and RMSEA = .146. For the 30-day data, the results were again similar: all factor loadings greater than .50, CFI = .939, TLI = .929, and RMSEA = .149.
3.4. IRT calibrations
The two item pools—95 items for severity of substance use and 20 items for positive appeal of substance use—were calibrated separately, and the calibrations were done twice, first with the 3-month data and then with the 30-day data, using the two-parameter GRM. IRT modeling of the 95-item severity pool with both time frames revealed many locally dependent item pairs, suggesting considerable redundancy among the items. To manage this issue, we identified 19 “core” items. These items were selected using both clinical and patient perspectives. From a clinical perspective, we included items reflecting the diagnostic criteria for substance use disorder (i.e., commonly accepted markers of the presence and severity of substance use), and from the patient perspective, we included items reflecting themes that were prominent in our focus groups (e.g., consequences of substance use, intended versus actual effects of use). After these core items were selected, we added back 34 other items that were locally independent from the 19 core items, resulting in a new set of 53 items. We repeated the IRT calibrations with these items, again using the 3-month and 30-day data separately. The item parameter estimates, test information curves, and standard error estimates were very similar for the two time frames. We computed intraclass correlation coefficients (ICCs) between the time frames for the discrimination parameters and the four location parameters. These ICCs were large: 0.95 for the discrimination parameter, and 0.98, 0.98, 0.97, and 0.97, respectively, for the four location parameters. Given these results, we combined the 3-month and 30-day data to compute final, definitive IRT item parameters. For the combined sample, the n was 1,336. From the pool of 53 severity items, we identified 16 items that were locally dependent following this final calibration, leaving an item bank of 37 items.
The IRT results with the pool of 20 items reflecting the positive appeal of substance use also showed that separate calibrations with the 3-month and 30-day data were similar. The ICCs for the item parameters between the two time frames were 0.96 for the discrimination parameter and 0.99 for all four location parameters. Therefore, we again used the combined data to compute final IRT item parameters. Calibration with the combined data revealed 2 items that were locally dependent, and they were removed from the bank, leaving 18 items for the final version.
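The comparison of parameter estimates across time frames can be sketched as follows. The text does not state which form of the ICC was used, so this example assumes a one-way random-effects ICC(1,1) with two estimates per item; the function name `icc_oneway` and the slope values are hypothetical and serve only to show the computation.

```python
def icc_oneway(x, y):
    """One-way random-effects ICC(1,1) for paired estimates (k = 2 per item)."""
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (k * n)
    pair_means = [(a + b) / 2 for a, b in zip(x, y)]
    # Between-item and within-item mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand) ** 2 for m in pair_means) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for a, b, m in zip(x, y, pair_means)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical slope estimates for five items (illustrative values, not study data)
slopes_3_month = [6.5, 5.8, 4.9, 3.6, 2.8]
slopes_30_day = [6.3, 6.0, 4.7, 3.8, 2.6]
print(round(icc_oneway(slopes_3_month, slopes_30_day), 3))
```

Values near 1 indicate that the two calibrations yield nearly interchangeable estimates, which is the rationale for pooling the 3-month and 30-day data.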
DIF analyses for both item banks showed that the IRTPRO and logistic regression methods did not jointly identify any items displaying DIF on the basis of age, gender, education, or more versus less severe substance use; thus, no items were eliminated for this reason. The Flesch-Kincaid readability test was performed on the 55 items in the two banks. The grade level for the 37 severity items in the aggregate was grade 2.5; for the 18 positive appeal items, it was grade 2.3. Tables 4 and 5 summarize the item banks, together with their IRT parameters. Figures 1 and 2 display the test information curves (and plots of corresponding standard errors). Standard errors of .30 correspond approximately to classical test theory reliabilities of .90. At this threshold, the effective range of measurement for the severity of use item bank was about −.6 to +2.7 SDs and for the positive appeal of substance use item bank, about −.8 to +2.2 SDs.
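The correspondence between standard errors and reliability noted above follows from the usual approximation, reliability = 1 − SE² / trait variance, with the latent trait scaled to unit variance, and from the relation SE = 1 / √information. A minimal sketch, with hypothetical function names:

```python
def reliability_from_se(se, trait_variance=1.0):
    """Approximate reliability implied by a standard error on the theta metric."""
    return 1.0 - (se ** 2) / trait_variance

def information_from_se(se):
    """Test information implied by a standard error (SE = 1 / sqrt(information))."""
    return 1.0 / se ** 2

print(round(reliability_from_se(0.30), 2))  # 0.91
print(round(information_from_se(0.30), 1))  # 11.1
```

A standard error of .30 therefore corresponds to test information of about 11, and the ranges reported above are the regions of the latent trait where the test information curves exceed that level.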
Table 4.
Calibrated Items: Severity of Substance Use
| Item Stem | Slope (Discrimination) | Location Threshold, Level 1 | Location Threshold, Level 2 | Location Threshold, Level 3 | Location Threshold, Level 4 |
|---|---|---|---|---|---|
| I felt that my drug use was out of control* | 6.56 | 0.47 | 0.71 | 0.93 | 1.11 |
| My desire to use drugs seemed overpowering* | 6.19 | 0.35 | 0.57 | 0.94 | 1.24 |
| Drugs were the only thing I could think about* | 5.83 | 0.45 | 0.74 | 1.08 | 1.39 |
| I felt I needed help for my drug problem | 5.65 | 0.42 | 0.61 | 0.77 | 0.97 |
| My drug use caused problems with people close to me* | 5.28 | 0.39 | 0.60 | 0.90 | 1.21 |
| I spent more time using drugs than I intended | 4.97 | 0.37 | 0.64 | 0.91 | 1.21 |
| I have a drug problem* | 4.97 | 0.19 | 0.46 | 0.68 | 0.90 |
| I did whatever I had to do to use drugs | 4.68 | 0.57 | 0.87 | 1.17 | 1.43 |
| Using drugs caused problems for me | 4.65 | 0.36 | 0.61 | 0.91 | 1.21 |
| I am addicted to drugs | 4.65 | 0.18 | 0.47 | 0.66 | 0.93 |
| I sold my belongings to buy drugs | 4.58 | 0.80 | 0.99 | 1.28 | 1.56 |
| Someone told me that I should cut down on my drug use | 4.49 | 0.40 | 0.67 | 1.00 | 1.33 |
| My drug use kept me from getting things done that I needed to do | 4.35 | 0.24 | 0.54 | 0.95 | 1.29 |
| I used drugs when I felt guilty | 4.19 | 0.51 | 0.74 | 1.16 | 1.45 |
| I needed to use more drugs to get the same effect I used to get | 4.18 | 0.30 | 0.55 | 0.89 | 1.23 |
| I spent a lot of time using drugs* | 4.05 | 0.11 | 0.49 | 0.89 | 1.20 |
| When I used drugs, I failed to take care of myself | 3.88 | 0.52 | 0.82 | 1.21 | 1.56 |
| When I used drugs, I experienced emotional problems | 3.79 | 0.36 | 0.66 | 1.08 | 1.51 |
| I planned my activities around using drugs | 3.64 | 0.31 | 0.59 | 1.05 | 1.41 |
| I used drugs to avoid my responsibilities | 3.53 | 0.65 | 0.90 | 1.33 | 1.65 |
| I did risky things while using drugs | 3.49 | 0.45 | 0.78 | 1.28 | 1.67 |
| I did things that could have gotten me into trouble while using drugs | 3.37 | 0.40 | 0.68 | 1.18 | 1.60 |
| When I used drugs, I felt alone | 3.17 | 0.44 | 0.76 | 1.23 | 1.64 |
| I used drugs to feel “normal” | 2.98 | 0.25 | 0.46 | 0.90 | 1.24 |
| I bought drugs for other people, but kept them for my own use | 2.98 | 0.97 | 1.23 | 1.58 | 1.94 |
| When I stopped using drugs, I got headaches | 2.95 | 0.54 | 0.84 | 1.35 | 1.75 |
| I almost died because of my drug use | 2.91 | 1.26 | 1.55 | 1.89 | 2.34 |
| I saved my money to buy drugs | 2.83 | 0.35 | 0.60 | 1.11 | 1.54 |
| When I used drugs, I got into fights. | 2.78 | 0.88 | 1.27 | 1.73 | 2.07 |
| When I used drugs, I hid my drug use from others. | 2.77 | 0.13 | 0.41 | 0.82 | 1.16 |
| When I used drugs, I felt anxious. | 2.66 | 0.29 | 0.74 | 1.29 | 1.72 |
| I used drugs to reward myself. | 2.64 | 0.18 | 0.51 | 1.08 | 1.54 |
| I used drugs because nearly all the people I know use drugs. | 2.35 | 0.40 | 0.80 | 1.28 | 1.76 |
| I used drugs to feel more alert. | 2.29 | 0.46 | 0.77 | 1.37 | 1.85 |
| I used drugs because people pressured me to use. | 2.08 | 1.14 | 1.69 | 2.34 | 2.80 |
| I drove a car while under the influence of drugs. | 1.80 | 0.53 | 0.93 | 1.45 | 2.14 |
Note. Items are rank-ordered on the basis of their slope (discrimination) parameters. Items included in the short form are marked with an asterisk.
Table 5.
Calibrated Items: Positive Appeal of Drug Use
| Item Stem | Slope (Discrimination) | Location Threshold, Level 1 | Location Threshold, Level 2 | Location Threshold, Level 3 | Location Threshold, Level 4 |
|---|---|---|---|---|---|
| I used drugs to feel more confident* | 6.30 | 0.45 | 0.65 | 0.96 | 1.24 |
| I used drugs to feel good about myself* | 5.58 | 0.43 | 0.63 | 0.95 | 1.23 |
| I used drugs to make it easier to talk to people* | 5.12 | 0.45 | 0.63 | 1.02 | 1.35 |
| Drugs made me feel like I could do anything* | 4.94 | 0.50 | 0.71 | 1.02 | 1.27 |
| I used drugs to feel close to someone | 4.01 | 0.63 | 0.85 | 1.26 | 1.61 |
| I used drugs to change my mood* | 3.81 | −0.05 | 0.16 | 0.69 | 1.08 |
| While using drugs, I liked myself better | 3.66 | 0.46 | 0.74 | 1.11 | 1.42 |
| I used drugs to have a good time* | 3.47 | −0.18 | 0.10 | 0.61 | 1.04 |
| I used drugs to be a part of something | 3.41 | 0.67 | 0.96 | 1.40 | 1.72 |
| I used drugs to be open to new experiences | 3.38 | 0.45 | 0.77 | 1.25 | 1.58 |
| I used drugs to be more creative | 3.32 | 0.40 | 0.66 | 1.19 | 1.61 |
| People seemed to like me better when I used drugs | 3.21 | 0.57 | 0.90 | 1.33 | 1.74 |
| I used drugs because I liked the feeling* | 2.92 | −0.49 | −0.16 | 0.23 | 0.64 |
| I used drugs to relax | 2.87 | −0.37 | −0.11 | 0.46 | 1.01 |
| I used drugs while enjoying TV, music, or video games | 2.71 | −0.14 | 0.11 | 0.62 | 1.07 |
| I enjoyed using drugs | 2.51 | −0.54 | −0.13 | 0.35 | 0.83 |
| I used drugs recreationally | 2.22 | −0.38 | −0.01 | 0.59 | 1.04 |
| I used drugs when I was with other people | 2.17 | −0.36 | 0.06 | 0.72 | 1.24 |
Note. Items are rank-ordered on the basis of their slope (discrimination) parameters. Items included in the short form are marked with an asterisk.
Figure 1.
Figure 2.
3.5. Selection of items for short forms
For some applications where CAT is not feasible, static short forms may be a useful alternative. To develop short forms, we rank ordered all severity of use and positive appeal of use items on four criteria: discrimination parameters, the percentage of times the item would have been selected in a simulated CAT for each item bank based on the observed data from our calibration sample, expected information under the standard normal distribution with a mean of 0 and SD of 1, and expected information under a normal distribution with a larger SD, i.e., a mean of 0 and SD of 1.5 (Choi et al., 2010). The CAT simulations were performed using the Firestar program (Choi, 2009). For the CAT simulations, we set the minimum number of items to be administered to 8 and the maximum number of items to be administered to be the full bank. We selected 7 items for each short form based on the convergence of the four psychometric criteria, the content of candidate items, and location parameters (i.e., we tried to include some items with lower thresholds to increase the precision of the short forms closer to the floor). In Tables 4 and 5, asterisks identify the items selected for the short forms.
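Two of the four ranking criteria, expected information under normal distributions with SD of 1 and 1.5, can be illustrated with a short numerical sketch. This is not the authors' code: it assumes the graded response model in the logistic metric, approximates the integral with a simple Riemann sum on a grid from -6 to 6, and uses hypothetical function names. The item parameters are taken from Table 5.

```python
import numpy as np

def grm_item_information(theta, a, b):
    """Fisher information of a graded response model item at each theta."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    b = np.asarray(b, dtype=float)
    p_star = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b[None, :])))
    ones = np.ones((theta.size, 1))
    zeros = np.zeros((theta.size, 1))
    cum = np.hstack([ones, p_star, zeros])   # P(X >= k) for k = 0..K
    probs = cum[:, :-1] - cum[:, 1:]         # category probabilities
    w = cum * (1.0 - cum)                    # zero at both boundaries
    dprobs = a * (w[:, :-1] - w[:, 1:])      # derivative of each category probability
    # Clip the denominator to guard against division by zero in the tails
    return np.sum(dprobs ** 2 / np.clip(probs, 1e-12, None), axis=1)

def expected_information(a, b, sd, lo=-6.0, hi=6.0, n=2401):
    """Average item information over a normal(0, sd) trait distribution."""
    grid = np.linspace(lo, hi, n)
    step = grid[1] - grid[0]
    density = np.exp(-0.5 * (grid / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))
    return float(np.sum(grm_item_information(grid, a, b) * density) * step)

# Highest- and lowest-slope items from Table 5
items = {
    "feel more confident": (6.30, [0.45, 0.65, 0.96, 1.24]),
    "with other people": (2.17, [-0.36, 0.06, 0.72, 1.24]),
}
expected = {
    name: {sd: expected_information(a, b, sd) for sd in (1.0, 1.5)}
    for name, (a, b) in items.items()
}
for name, by_sd in expected.items():
    print(name, {sd: round(val, 2) for sd, val in by_sd.items()})
```

Ranking items on such averages rewards items whose information is concentrated where respondents are expected to be. The wider SD of 1.5 gives relatively more weight to items that are informative far from the mean.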
The internal consistency of the short forms was excellent. Alpha coefficients were .94 for the severity of use short form and .92 for the positive appeal of substance use short form. Correlations between the theta scores derived from the short forms and the corresponding full item banks were large: .94 for severity of use and .96 for positive appeal of use.
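For readers who wish to reproduce this kind of internal-consistency check on their own data, coefficient alpha can be computed directly from a respondents-by-items score matrix. The sketch below uses simulated continuous scores as a stand-in (the calibration data are not reproduced here), and the function name `cronbach_alpha` is illustrative.

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for a (respondents x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical data: 500 simulated respondents, 7 items sharing one common factor
rng = np.random.default_rng(0)
factor = rng.normal(size=(500, 1))
simulated = factor + rng.normal(size=(500, 7))
alpha_simulated = cronbach_alpha(simulated)
print(round(alpha_simulated, 2))
```

With real short-form data, each column would hold the 5-category responses to one of the seven items.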
3.6. Convergence with DSM-5 diagnostic criteria
Inspection of the 37 items in the severity item bank revealed that 23 of them could be mapped onto the 11 diagnostic criteria for substance use disorder now included in the DSM-5 (see Table 6). Each criterion was represented by at least one item (up to a maximum of four). The 14 items distinct from the diagnostic criteria reflected a variety of other themes: acknowledgement of having a drug problem, ways of procuring drugs, emotional triggers and consequences of substance use, and social motives and pressures to use drugs.
Table 6.
DSM-5 Criteria and Corresponding Severity of Substance Use Items
| DSM-5 Criterion | Severity of Substance Use Item |
|---|---|
| A1. Uses more than intended, or for longer than intended | I spent more time using drugs than I intended |
| A2. Efforts to control or cut back on use have been unsuccessful | I felt that my drug use was out of control* |
| A3. Large amounts of time are spent obtaining, using, or recovering from use | I spent a lot of time using drugs* |
| A4. Cravings (the presence of a strong desire to use) | My desire to use drugs seemed overpowering* |
| | Drugs were the only thing I could think about* |
| | I craved drugs* |
| A5. Recurrent use resulting in problems at work, home, or school | Using drugs caused problems for me |
| | My drug use kept me from getting things done that I needed to do |
| A6. Continued use despite recurrent social or interpersonal problems resulting from use | My drug use caused problems with people close to me* |
| | When I used drugs, I got into fights |
| | When I used drugs, I hid my drug use from others |
| A7. Curtailing important activities in favor of use | I planned my activities around using drugs |
| | I used drugs to avoid my responsibilities |
| A8. Use despite potentially hazardous outcomes | I did risky things while using drugs |
| | I did things that could have gotten me into trouble while using drugs |
| | I almost died because of my drug use |
| | I drove a car while under the influence of drugs |
| A9. Continued use despite its causing or exacerbating a persistent physical or psychological problem | When I used drugs, I failed to take care of myself |
| | When I used drugs, I experienced emotional problems |
| | When I used drugs, I felt alone |
| | When I used drugs, I felt anxious |
| A10. Tolerance or a need for increased amounts of substance | I needed to use more drugs to get the same effect I used to get |
| A11. Withdrawal symptoms | When I stopped using drugs, I got headaches |
Note. Items included in the short form are marked with an asterisk.
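The mapping in Table 6 can also be held as a simple lookup structure, which makes the coverage counts reported in the text easy to verify. The sketch below transcribes the table (short-form asterisks omitted); the variable names are illustrative only.

```python
# Table 6 transcribed as a mapping from DSM-5 criterion to severity items
dsm5_to_items = {
    "A1": ["I spent more time using drugs than I intended"],
    "A2": ["I felt that my drug use was out of control"],
    "A3": ["I spent a lot of time using drugs"],
    "A4": [
        "My desire to use drugs seemed overpowering",
        "Drugs were the only thing I could think about",
        "I craved drugs",
    ],
    "A5": [
        "Using drugs caused problems for me",
        "My drug use kept me from getting things done that I needed to do",
    ],
    "A6": [
        "My drug use caused problems with people close to me",
        "When I used drugs, I got into fights",
        "When I used drugs, I hid my drug use from others",
    ],
    "A7": [
        "I planned my activities around using drugs",
        "I used drugs to avoid my responsibilities",
    ],
    "A8": [
        "I did risky things while using drugs",
        "I did things that could have gotten me into trouble while using drugs",
        "I almost died because of my drug use",
        "I drove a car while under the influence of drugs",
    ],
    "A9": [
        "When I used drugs, I failed to take care of myself",
        "When I used drugs, I experienced emotional problems",
        "When I used drugs, I felt alone",
        "When I used drugs, I felt anxious",
    ],
    "A10": ["I needed to use more drugs to get the same effect I used to get"],
    "A11": ["When I stopped using drugs, I got headaches"],
}

# Coverage summaries: items per criterion, total mapped items, single-item criteria
counts = {criterion: len(items) for criterion, items in dsm5_to_items.items()}
total_items = sum(counts.values())
single_item_criteria = [c for c, n in counts.items() if n == 1]
print(total_items, max(counts.values()), single_item_criteria)
```

A structure of this kind would be a natural starting point for assembling a criterion-focused short form, since it shows at a glance which criteria depend on a single item.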
4. Discussion
The mixed (qualitative and quantitative) PROMIS methodology was used to create two item banks relevant to substance use: a 37-item bank measuring severity of substance use and an 18-item bank measuring positive appeal of substance use. As expected, we found an overall severity factor (and item bank). However, the item bank for positive appeal of use is more novel. This second item bank for positive appeal may be especially relevant for community samples, and it is timely because legalization of marijuana (for both medical and recreational purposes) is increasingly common. In the calibration data from the community sample, cannabis was the most commonly used substance (current and lifetime), and it generated the largest ASSIST SI correlation with the theta score for positive appeal of use, r = .61 (as well as the severity theta score, r = .56). In general, there were stronger relationships in the community sample between SI scores for all substances and positive appeal of use.
Given its content, the item bank for severity of use could be used for the development of a short form that captures the DSM-5 diagnostic criteria for substance use disorder (see Table 6). The PROMIS mandate is to create measures of severity (not diagnosis) that can be used as common metrics across all health states and diseases. However, it is typical for investigators and clinicians to ask about the relevance of PROMIS measures for screening and diagnostic purposes and about thresholds on the severity metric that should motivate clinical concern or intervention. A short form focused on diagnostic criteria might be useful clinically. At the same time, it must be emphasized that the mapping of the PROMIS items to the DSM-5 criteria is preliminary and that the validity of the items in relation to both the individual DSM-5 criteria and the presence of a diagnosis requires further investigation. Five of the criteria are linked to only one PROMIS item, and “physical problems” in criterion A9 are not well represented.
In general, there is a need for further studies of the validity of the new item banks. Data from the calibration sample provided only preliminary evidence of validity, e.g., correlations with ASSIST SI scores (see the supplement available online, as described in the footnotes). Validity studies can take many forms, but consistent with past PROMIS precedents (Pilkonis et al., 2014) they should include (a) psychometric studies comparing the operating characteristics of the PROMIS item banks and short forms with other commonly used legacy measures and (b) longitudinal studies of responsiveness to change in circumstances where one would (or would not) expect variability over time.
In this context, it is important to note that analysis of longitudinal data regarding substance use presents special challenges. Before asking about severity of substance use, the item banks require a screening question to document the presence of some use: “In the past 30 days, have you used drugs (other than alcohol or your prescribed medications)?” Thus, the item banks provide information about two related but different phenomena: exposure to drugs and severity of use when exposed. Analyzing such data requires novel models, and an approach we have found useful is the two-part semi-continuous model (Olsen and Schafer, 2001). The model allows simultaneous analysis of binary (yes/no) and continuous (severity) data, yielding parameter estimates (intercepts and slopes) for functions that model both exposure and severity. Other discussions of analyzing data in which information about severity is contingent on the presence of some behavior or event are also available (e.g., Liu and Verkuilen, 2013; Reardon and Raudenbush, 2006). It is important to acknowledge that there is only a single screening question for the new item banks and that the use of a single question may limit reliability and validity. At the same time, there are many legacy instruments for assessing exposure to drugs, including items regarding the types of drugs used and the frequency and quantity of use. We used the ASSIST for just this purpose, and other investigators may also wish to include companion measures that provide more detailed information regarding exposure to drugs. It should also be noted that abuse of prescription pain medication (obtained either medically or illicitly) is an increasing public health problem. A separate PROMIS item bank has been created for this purpose, and a description of its development and calibration will be the subject of a separate report.
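As an illustration of the two-part approach described above, the general structure of a two-part random-effects model in the spirit of Olsen and Schafer (2001) can be written as follows. The notation is ours and is offered only as a sketch: $U_{ij}$ indicates whether respondent $i$ reports any drug use at occasion $j$ (the screening question), $Y_{ij}$ is the severity score observed only when $U_{ij} = 1$, $\mathbf{x}$ and $\mathbf{z}$ are fixed- and random-effect covariates (such as an intercept and time), and the two parts are linked through correlated random effects.

```latex
\begin{align}
\operatorname{logit}\,\Pr(U_{ij} = 1 \mid \mathbf{c}_i)
  &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \mathbf{z}_{ij}^{\top}\mathbf{c}_i, \\
Y_{ij} \mid (U_{ij} = 1,\ \mathbf{d}_i)
  &\sim \mathcal{N}\!\left(\mathbf{x}_{ij}^{*\top}\boldsymbol{\gamma}
      + \mathbf{z}_{ij}^{*\top}\mathbf{d}_i,\ \sigma^{2}\right), \\
(\mathbf{c}_i,\ \mathbf{d}_i) &\sim \mathcal{N}(\mathbf{0},\ \boldsymbol{\Psi}).
\end{align}
```

Here $\boldsymbol{\beta}$ holds the intercepts and slopes for exposure and $\boldsymbol{\gamma}$ those for severity. The off-diagonal blocks of $\boldsymbol{\Psi}$ allow a respondent's propensity to use drugs to be correlated with severity when use occurs. Treating the severity score as approximately normal without transformation is an assumption of this sketch.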
In summary, the development of two new item banks for substance use—severity of use and positive appeal of use—adds to the existing body of PROMIS measures. The initial psychometric characteristics of the item banks support their use as CATs or short forms, with either version providing brief, precise, and efficient measures relevant to both clinical and community samples. Further studies of the validity of the item banks are now appropriate to support their adoption.
Highlights.
Severity and positive appeal of substance use item banks were developed for Patient-Reported Outcomes Measurement Information System (PROMIS).
PROMIS substance use item banks are suitable in clinical and community settings.
The item banks are applicable with either 30-day or 3-month timeframes.
Seven-item static short forms were also developed from each item bank.
Acknowledgments
We would also like to thank our collaborators at the recruitment sites: Center for Psychiatric and Chemical Dependency Services, Pittsburgh, PA (Dorothy J. Sandstrom, MS; Janis McDonald; Trey Ghee, MSW); CODA, Inc., Portland, OR (Katharina Wiest, PhD; Rosalie Gordon, BA; Kasie Cloud, MSW), Coastal Horizons Center, Inc., Wilmington, NC (Kenny S. G. House, BA, LCAS, CCS; Michael Younkins, BS, BBA), and Southlight Healthcare, Raleigh, NC (Alyssa H. Kalata, PhD, Scott Luetgenau, BSW, Reynolds (Tad) Clodfelter, Jr., PsyD).
Footnotes
Supplementary material can be found by accessing the online version of this paper at http://dx.doi.org and by entering doi:…
Contributors
Paul A. Pilkonis, PhD, contributed to study conception, design, and implementation and took responsibility for drafting the manuscript. Lan Yu, PhD, provided data analysis and interpretation. Nathan Dodds, BS, Kelly L. Johnston, MPH, Suzanne Lawrence, MA, Thomas F. Hilton, PhD, Dennis C. Daley, PhD, Ashwin A. Patkar, MD and Dennis McCarty, PhD contributed to study implementation (literature reviews, conceptual organization of items, item and intellectual property reviews) and data collection (focus groups, cognitive interviews, coordination of field testing for calibration). All authors reviewed and approved the final manuscript.
Conflict of Interest
There are no conflicts of interest for any authors.
Author Disclosures
Role of Funding Source
PROMIS® was funded with cooperative agreements from the National Institutes of Health (NIH) Common Fund Initiative (Northwestern University, PI: David Cella, PhD, U54AR057951, U01AR052177; Northwestern University, PI: Richard C. Gershon, PhD, U54AR057943; American Institutes for Research, PI: Susan (San) D. Keller, PhD, U54AR057926; State University of New York, Stony Brook, PIs: Joan E. Broderick, PhD and Arthur A. Stone, PhD, U01AR057948, U01AR052170; University of Washington, Seattle, PIs: Heidi M. Crane, MD, MPH, Paul K. Crane, MD, MPH, and Donald L. Patrick, PhD, U01AR057954; University of Washington, Seattle, PI: Dagmar Amtmann, PhD, U01AR052171; University of North Carolina, Chapel Hill, PI: Harry A. Guess, MD, PhD (deceased), Darren A. DeWalt, MD, MPH, U01AR052181; Children’s Hospital of Philadelphia, PI: Christopher B. Forrest, MD, PhD, U01AR057956; Stanford University, PI: James F. Fries, MD, U01AR052158; Boston University, PIs: Alan Jette, PT, PhD, Stephen M. Haley, PhD (deceased), and David Scott Tulsky, PhD (University of Michigan, Ann Arbor), U01AR057929; University of California, Los Angeles, PIs: Dinesh Khanna, MD (University of Michigan, Ann Arbor) and Brennan Spiegel, MD, MSHS, U01AR057936; University of Pittsburgh, PI: Paul A. Pilkonis, PhD, U01AR052155; Georgetown University, PIs: Carol M. Moinpour, PhD (Fred Hutchinson Cancer Research Center, Seattle) and Arnold L. Potosky, PhD, U01AR057971; Children’s Hospital Medical Center, Cincinnati, PI: Esi M. Morgan DeWitt, MD, MSCE, U01AR057940; University of Maryland, Baltimore, PI: Lisa M. Shulman, MD, U01AR057967; and Duke University, PI: Kevin P. Weinfurt, PhD, U01AR052186).
NIH Science Officers on this project have included Deborah Ader, PhD, Vanessa Ameen, MD (deceased), Susan Czajkowski, PhD, Basil Eldadah, MD, PhD, Lawrence Fine, MD, DrPH, Lawrence Fox, MD, PhD, Lynne Haverkos, MD, MPH, Thomas Hilton, PhD, Laura Lee Johnson, PhD, Michael Kozak, PhD, Peter Lyster, PhD, Donald Mattison, MD, Claudia Moy, PhD, Louis Quatrano, PhD, Bryce Reeve, PhD, William Riley, PhD, Peter Scheidt, MD, Ashley Wilder Smith, PhD, MPH, Susana Serrate-Sztein, MD, William Phillip Tonkins, DrPH, Ellen Werner, PhD, Tisha Wiley, PhD, and James Witter, MD, PhD. An award from the National Institute on Drug Abuse (UG1 DA015815) provided additional support for Dennis McCarty’s participation. This article uses data developed under PROMIS. These contents do not necessarily represent an endorsement by the US Government or PROMIS. See www.nihpromis.org for additional information on the PROMIS® initiative.
References
- Ali R, Meena S, Eastwood B, Richards I, Marsden J. Ultra-rapid screening for substance-use disorders: the Alcohol, Smoking, and Substance Involvement Screening Test (ASSIST-Lite). Drug Alcohol Depend. 2013;132:352–361. doi: 10.1016/j.drugalcdep.2013.03.001.
- Buysse D, Yu L, Moul DE, Germain A, Stover A, Dodds NE, Johnston KL, Shablesky-Cade MA, Pilkonis PA. Development and validation of patient-reported outcome measures for sleep disturbance and sleep-related impairments. Sleep. 2010;33:781–792. doi: 10.1093/sleep/33.6.781.
- Cai L, Thissen D, du Toit SHC. IRTPRO: Flexible, Multidimensional, Multiple Categorical IRT Modeling [Computer Software]. Scientific Software International; Lincolnwood, IL: 2011.
- Castel LD, Williams KA, Bosworth HB, Eisen SV, Hahn EA, Irwin DE, Kelly MAR, Morse J, Stover A, DeWalt DA, DeVellis RF. Content validity in the PROMIS social health domain: a qualitative analysis of focus group data. Qual Life Res. 2008;17:737–749. doi: 10.1007/s11136-008-9352-3.
- Cella D, Riley W, Stone A, Rothrock N, Reeve BB, Yount S, Amtmann D, Bode R, Buysse D, Choi S, Cook K, DeVellis R, DeWalt D, Fries JF, Gershon R, Hahn EA, Lai JS, Pilkonis P, Revicki D, Rose M, Weinfurt K, Hays R. The patient-reported outcomes measurement information system (PROMIS) developed and tested its first wave of adult self-reported health outcome item bank: 2005–2008. J Clin Epidemiol. 2010;63:1179–1194. doi: 10.1016/j.jclinepi.2010.04.011.
- Choi S. Firestar: Computerized adaptive testing simulation program for polytomous item response theory models. Appl Psychol Meas. 2009;33:644–645.
- Choi SW, Reise SP, Pilkonis PA, Hays RD, Cella D. Efficiency of static and computer adaptive short forms compared to full-length measures of depressive symptoms. Qual Life Res. 2010;19:125–136. doi: 10.1007/s11136-009-9560-5.
- Cox WM, Klinger E, Fadardi JS. The motivational basis of cognitive determinants of addictive behaviors. Addict Behav. 2015;44:16–22. doi: 10.1016/j.addbeh.2014.11.019.
- Dow SJ, Kelly JF. Listening to youth: Adolescents’ reasons for substance use as a unique predictor of treatment response and outcome. Psychol Addict Behav. 2013;27:1122–1131. doi: 10.1037/a0031065.
- Fries JF, Cella D, Rose M, Krishnan E, Bruce B. Progress in assessing physical function in arthritis: PROMIS short forms and computerized adaptive testing. J Rheumatol. 2009;36:2061–2066. doi: 10.3899/jrheum.090358.
- Grant BF, Dawson DA. Age of onset of drug use and its association with DSM-IV drug abuse and dependence: results from the national longitudinal alcohol epidemiologic survey. J Subst Abuse. 1998;10:163–173. doi: 10.1016/s0899-3289(99)80131-x.
- Green BA, Ahmed AO, Marcus DK, Walters GD. The latent structure of alcohol use pathology in an epidemiological sample. J Psychiatr Res. 2011;45:225–233. doi: 10.1016/j.jpsychires.2010.06.001.
- Hartman CA, Gelhorn H, Crowley T, Sakai JT, Stallings M, Young SE, Rhee SH, Corley R, Hewitt JK, Hopfer CJ. Item response theory analysis of DSM-IV cannabis abuse and dependence criteria in adolescents. J Am Acad Child Adolesc Psychiatry. 2008;47:165–173. doi: 10.1097/chi.0b013e31815cd9f2.
- Hasin DS, O’Brien CP, Auriacombe M, Borges G, Bucholz K, Budney A, Compton WM, Crowley T, Ling W, Petry NM, Schuckit M, Grant BF. DSM-5 criteria for substance use disorders: recommendations and rationale. Am J Psychiatry. 2013;170:834–851. doi: 10.1176/appi.ajp.2013.12060782.
- Hendryx MS, Haviland MG, Gibbons RD, Clark DC. An application of item response theory to alexithymia assessment among abstinent alcoholics. J Pers Assess. 1992;58:506–515. doi: 10.1207/s15327752jpa5803_6.
- Hilton TF. The promise of PROMIS® for addiction. Drug Alcohol Depend. 2011;119:229–234. doi: 10.1016/j.drugalcdep.2011.09.031.
- Janulis P. Improving measurement of injection drug risk behavior using item response theory. Am J Drug Alcohol Abuse. 2014;40:143–150. doi: 10.3109/00952990.2013.848212.
- Jones BT, Corbin W, Fromme K. A review of expectancy theory and alcohol consumption. Addiction. 2001;96:57–72. doi: 10.1046/j.1360-0443.2001.961575.x.
- Kelly MA, Morse JQ, Stover A, Hofkens T, Huisman E, Shulman S, Eisen SV, Becker SJ, Weinfurt K, Boland E, Pilkonis PA. Describing depression: congruence between patient experiences and clinical assessments. Br J Clin Psychol. 2011;50:46–66. doi: 10.1348/014466510X493926.
- King KM, Chassin L. A prospective study of the effects of age of initiation of alcohol and drug use on young adult substance dependence. J Stud Alcohol Drugs. 2007;68:256–265. doi: 10.15288/jsad.2007.68.256.
- Klem ML, Saghafi E, Abromitis R, Stover A, Dew MA, Pilkonis PA. Building PROMIS item banks: Librarians as co-investigators. Qual Life Res. 2009;18:881–888. doi: 10.1007/s11136-009-9498-7.
- Korcha RA, Polcin DL, Bond JC, Lapp WM, Galloway G. Substance use and motivation: a longitudinal perspective. Am J Drug Alcohol Abuse. 2011;37:48–53. doi: 10.3109/00952990.2010.535583.
- Krueger RF, Nichol PE, Hicks BM, Markon KE, Patrick CJ, Iacono WG, McGue M. Using latent trait modeling to conceptualize an alcohol problems continuum. Psychol Assess. 2004;16:107–119. doi: 10.1037/1040-3590.16.2.107.
- Langenbucher JW, Labouvie E, Martin CS, Sanjuan PM, Bavly L, Kirisci L, Chung T. An application of item response theory analysis to alcohol, cannabis, and cocaine criteria in DSM-IV. J Abnorm Psychol. 2004;113:72–80. doi: 10.1037/0021-843X.113.1.72.
- Liu Y, Verkuilen J. Item response modeling of presence-severity items: application to measurement of patient-reported outcomes. Appl Psychol Meas. 2013;37:58–75.
- Lynskey MT, Agrawal A. Psychometric properties of DSM assessments of illicit drug abuse and dependence: results from the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). Psychol Med. 2007;37:1345–1355. doi: 10.1017/S0033291707000396.
- McCarty D, Fuller B, Kaskutas LA, Wendt WW, Nunes EV, Miller M, Forman R, Magruder KM, Arfken C, Copersino M, Floyd A, Sindelar J, Edmundson E. Treatment programs in the National Drug Abuse Treatment Clinical Trials Network. Drug Alcohol Depend. 2008;92:200–207. doi: 10.1016/j.drugalcdep.2007.08.004.
- Muthén B. Should substance use disorders be considered as categorical or dimensional? Addiction. 2006;101:6–16. doi: 10.1111/j.1360-0443.2006.01583.x.
- Muthén LK, Muthén B. Mplus User’s Guide (Version 6.2). Muthén & Muthén; Los Angeles, CA: 2010.
- Olsen MK, Schafer JL. A two-part random effects model for semicontinuous longitudinal data. J Am Stat Assoc. 2001;96:730–745.
- Pabst A, Baumeister SE, Kraus L. Alcohol-expectancy dimensions and alcohol consumption at different ages in the general population. J Stud Alcohol Drugs. 2009;71:46–53. doi: 10.15288/jsad.2010.71.46.
- Pilkonis PA, Choi SW, Reise SP, Stover AM, Riley WT, Cella D. Item banks for measuring emotional distress from the Patient-Reported Outcomes Measurement Information System (PROMIS®): depression, anxiety, anger. Assessment. 2011;18:263–283. doi: 10.1177/1073191111411667.
- Pilkonis PA, Yu L, Colditz J, Dodds NE, Johnston KL, Maihoefer C, Stover AM, Daley DC, McCarty D. Item banks for alcohol use from the Patient-Reported Outcomes Measurement Information System (PROMIS®): use, consequences, and expectancies. Drug Alcohol Depend. 2013;130:167–177. doi: 10.1016/j.drugalcdep.2012.11.002.
- Pilkonis PA, Yu L, Dodds NE, Johnston KL, Maihoefer C, Lawrence SM. Validation of the depression item bank from the Patient-Reported Outcomes Measurement Information System (PROMIS®) in a three-month observational study. J Psychiatr Res. 2014;56:112–119. doi: 10.1016/j.jpsychires.2014.05.010.
- Reardon S, Raudenbush SW. A partial independence item response model for surveys in which responses to filter questions determine whether subsequent questions are asked. Sociol Methodol. 2006;36:257–300.
- Reeve BB, Hays RD, Bjorner JB, Cook KF, Crane PK, Teresi JA, Thissen D, Revicki DA, Weiss DJ, Hambleton RK, Liu H, Gershon R, Reise SP, Cella D. Psychometric evaluation and calibration of Health-Related Quality of Life item banks: plans for the Patient-Reported Outcome Measurement Information System (PROMIS). Med Care. 2007;45:S22–S31. doi: 10.1097/01.mlr.0000250483.85507.04.
- Revicki D, Chen W, Harnam N, Cook K, Amtmann D, Callahan LF, Jensen MP, Keefe FJ. Development and psychometric analysis of the PROMIS pain behavior item bank. Pain. 2009;146:158–169. doi: 10.1016/j.pain.2009.07.029.
- Saha T, Compton WM, Chou SP, Smith S, Ruan WJ, Huang B, Pickering RP, Grant BF. Analyses related to the development of DSM-5 criteria for substance use related disorders: 1. toward amphetamine, cocaine and prescription drug use disorder continua using item response theory. Drug Alcohol Depend. 2012;122:38–46. doi: 10.1016/j.drugalcdep.2011.09.004.
- Saha TD, Stinson FS, Grant BF. The role of alcohol consumption in future classification of alcohol use disorders. Drug Alcohol Depend. 2007;89:82–92. doi: 10.1016/j.drugalcdep.2006.12.003.
- Samejima F. Estimation of Latent Ability Using a Response Pattern of Graded Scores. Psychometrika Monograph No. 17. Psychometric Society; Richmond, VA: 1969.
- Stinson FS, Grant BF, Dawson DA, Ruan WJ, Huang B, Saha T. Comorbidity between DSM-IV alcohol and specific drug use disorders in the United States: results from the National Epidemiologic Survey on Alcohol and Related Conditions. Drug Alcohol Depend. 2005;80:105–116. doi: 10.1016/j.drugalcdep.2005.03.009.
- Stucky BD, Edelen MO, Ramchand R. A psychometric assessment of the GAIN Individual Severity Scale (GAIN-GISS) and Short Screeners (GAIN-SS) among adolescents in outpatient treatment programs. J Subst Abuse Treat. 2014;46:165–173. doi: 10.1016/j.jsat.2013.07.007.
- Swendsen J, Conway KP, Degenhardt L, Dierker L, Glantz M, Jin R, Merikangas KR, Sampson N, Kessler RC. Socio-demographic risk factors for alcohol and drug dependence: the 10-year follow-up of the national comorbidity survey. Addiction. 2009;104:1346–1355. doi: 10.1111/j.1360-0443.2009.02622.x.
- Tai B, Straus MM, Liu D, Sparenborg S, Jackson R, McCarty D. The first decade of the National Drug Abuse Treatment Clinical Trials Network: bridging the gap between research and practice to improve drug abuse treatment. J Subst Abuse Treat. 2010;38:S4–S13. doi: 10.1016/j.jsat.2010.01.011.
- Teresi JA, Ocepek-Welikson K, Kleinman M, Eimicke JP, Crane PK, Jones RN, Lai JS, Choi SW, Hays RD, Reeve BB, Reise SP, Pilkonis PA, Cella D. Analysis of differential item functioning in the depression item bank from the Patient Reported Outcome Measurement Information System (PROMIS): An item response theory approach. Psychol Sci Q. 2009;51:148–180.
- Thissen D, Steinberg L, Wainer H. Detection of differential item functioning using the parameters of item response models. In: Holland PW, Wainer H, editors. Differential Item Functioning. Lawrence Erlbaum; Hillsdale, NJ: 1993. pp. 67–113.
- WHO ASSIST Working Group. The Alcohol, Smoking and Substance Involvement Screening Test (ASSIST): development, reliability and feasibility. Addiction. 2002;97:1183–1194. doi: 10.1046/j.1360-0443.2002.00185.x.
- Wilkinson GS. The Wide Range Achievement Test: Manual. Wide Range; Wilmington, DE: 1993.
- Wilkinson GS, Robertson GJ. WRAT4: Wide Range Achievement Test Professional Manual. Psychological Assessment Resources; Lutz, FL: 2006.
- Zumbo BD. A Handbook on the Theory and Methods of Differential Item Functioning (DIF): Logistic Regression Modeling as a Unitary Framework for Binary and Likert-type (Ordinal) Item Scores. Directorate of Human Resources Research and Education, Department of National Defense; Ottawa, ON: 1999.