Author manuscript; available in PMC: 2020 Nov 27.
Published in final edited form as: Assessment. 2020 Feb 10;27(6):1075–1088. doi: 10.1177/1073191120903092

Using Complete Enumeration to Derive “One-Size-Fits-All” versus “Subgroup-Specific” Diagnostic Rules for Substance Use Disorder

Cassandra L Boness 1, Jordan E Stevens 1, Douglas Steinley 1, Timothy Trull 1, Kenneth J Sher 1
PMCID: PMC7694888  NIHMSID: NIHMS1642595  PMID: 32037845

Abstract

The use of fixed diagnostic rules, whereby the same diagnostic algorithms are applied across all individuals regardless of personal attributes, has been the tradition in the Diagnostic and Statistical Manual of Mental Disorders. This practice of “averaging” across individuals inevitably introduces diagnostic error. Further, these average rules are typically derived through expert consensus rather than through data-driven approaches. Utilizing NSDUH 2013 (N = 23,889), we examined whether subgroup-specific, “customized” alcohol use disorder diagnostic rules, derived using deterministic optimization, perform better than an average, “one-size-fits-all” diagnostic rule. The average solution for the full sample included a set size of six and diagnostic threshold of three. Subgroups had widely varying set sizes (M=6.870; range=5-10) with less varying thresholds (M=2.70; range=2-4). External validation verified that the customized algorithms performed as well as, and sometimes better than, the average solution in the prediction of relevant correlates. However, the average solution still performed adequately with respect to external validators.

Keywords: Alcohol Use Disorder, Diagnosis, Optimization, Diagnostic Classifications, Assessment


The Fifth Edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; APA, 2013a) largely takes a “one-size-fits-all” approach to diagnosis whereby the same diagnostic rules are applied across all individuals, rarely considering individual differences such as age, gender, and race/ethnicity. The DSM-5 uses what are known as “fixed diagnostic rules” (Finn, 1982). This means that, for a given disorder, the same diagnostic criteria set and diagnostic threshold (i.e., diagnostic rule) is applied across all individuals, regardless of individual differences. In the case of DSM-5 alcohol use disorder (AUD), for example, the diagnostic criteria set consists of 11 symptoms with a diagnostic threshold of 2 (i.e., number of symptoms needed to diagnose). Furthermore, diagnostic rules are largely based on the agreement of experts, which sometimes includes the examination of group data, for example the National Epidemiologic Survey on Alcohol and Related Conditions (e.g., Hasin et al., 2013). As a result, diagnostic rules are inherently based on averages, meaning there is always a degree of diagnostic error when applied to an individual (Finn, 1982). Although this practice was initially implemented to increase diagnostic reliability and feasibility of implementation, it has multiple shortcomings.

The exception to the practice of fixed diagnostic rules is age, although inconsistently considered across DSM-5 diagnoses. The DSM-5, in a few specific cases, presented updated criteria to more precisely capture the symptoms and experiences of children with a given disorder. As one example, the DSM-5 now includes a Post-Traumatic Stress Disorder (PTSD) subtype for children under age 6 (APA, 2013b), which presents a unique diagnostic rule to account for differences in symptom presentation. Although the DSM-5 appreciates age-related aspects of some mental disorders and the resulting need for different diagnostic rules (and even different diagnostic categories, such as conduct disorder) based on age, it fails to present different rules for individuals based on gender or race/ethnicity. Instead, gender and race/ethnicity are addressed by sections titled “Gender-Related Diagnostic Issues” and “Culture-Related Diagnostic Issues,” respectively, which provide brief text descriptions (APA, 2013a). Arguably, this fails to adequately capture the inherent complexity of mental disorder diagnosis or provide sufficient guidance to enable principled, population-specific diagnosis.

The lack of consideration of individual-level characteristics such as age, gender, and race/ethnicity across disorders may be problematic given (1) there are widely varying psychopathology prevalence rates based on these individual characteristics (e.g., Grant & Weissman, 2007; McLean, Asnaani, Litz, & Hoffman, 2011; Riolo et al., 2005), (2) criteria differ in their Item Response Theory (IRT) based severities depending on the characteristics of the sample (e.g., Lane, Steinley, & Sher, 2016), and (3) there exists a wealth of research demonstrating measurement bias (e.g., differential item functioning) in diagnostic criteria across these demographic subgroups (e.g., Hoertel et al., 2014; Balsis et al., 2007; Harford, Yi, Faden, & Chen, 2009; Srisurapanont et al., 2012). Taken together, these findings raise questions about the validity of diagnostic criteria across different subgroups of individuals as well as the rules used to combine the diagnostic criteria (i.e., the content and size of the criteria set and the diagnostic threshold). The current paper is mainly concerned with the latter.

Previous research has found that, specifically among racial/ethnic minoritized adolescents, there are significant differences in the sensitivity and/or specificity of certain Composite International Diagnostic Interview (CIDI) diagnoses (e.g., agoraphobia, panic disorder, PTSD, and ADHD; Greif Green et al., 2012). Such differences can result in diagnostic misclassification (i.e., false positives or false negatives). When the authors modified the diagnostic rules for these disorders (e.g., by adding items) to improve disorder identification, such modifications reduced error in the prevalence estimates across the groups for agoraphobia and ADHD, but not PTSD or panic disorder. This suggests that diagnostic instruments based on DSM criteria may result in differential accuracy in racial/ethnic subgroups. Similarly, Wagner and colleagues (2002) demonstrated that, among adolescents, Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) Alcohol Use Disorder (AUD) symptoms result in significant subgroup (i.e., gender and race/ethnicity) variation with regard to their (a) incidence (i.e., if DSM-IV AUD symptoms appear) and (b) onset (i.e., when DSM-IV AUD symptoms appear) age, suggesting that DSM-IV criteria may not be the most suitable for adolescents. Applying a “one-size-fits-all” diagnostic rule across different demographic groups may, therefore, result in misleading prevalence rates and disorder misclassification (Greif Green et al., 2012; Finn, 1982). Further, the “one-size-fits-all” procedure is probably more detrimental for minoritized groups given that most research is conducted among white Americans. This suggests a “customized” diagnostic approach based on individual demographics may increase diagnostic accuracy and may result in strong incremental validity that is worth considering.

“Customized” diagnostic algorithms can be derived using an empirical approach such as deterministic optimization. Optimization approaches empirically derive algorithms based on a priori clinical correlates and outcomes that serve as the derivation variable (DV). This results in a level of transparency that is lacking in the traditional manner of deriving diagnostic algorithms via an expert consensus process, such as that used to derive the DSM diagnoses (Frances & Widiger, 2012; Wakefield, 2015). Optimization approaches have also been successfully used to create diagnostic short-forms (Raffo et al., 2019). As such, they offer an objective, empirical method for deriving diagnoses.

Using a complete-enumeration (i.e., generating all possible subsets of item combinations), deterministic optimization approach developed by Steinley et al. (2016) and expanded upon by Stevens et al. (2018, 2019), Boness, Stevens, Steinley, Trull, and Sher (2018) utilized the National Survey on Drug Use and Health (NSDUH) 2010 and 2013 to demonstrate that AUD diagnostic rules can be derived using a data-driven approach that considers important attributes such as AUD base rate and relevant correlates (e.g., consumption). They further demonstrated that the newly derived diagnostic rules perform well with respect to external validators (e.g., treatment usage) and possess increased diagnostic efficiency above and beyond the DSM-5. Although the focus of the Boness et al. paper was on developing optimal rules for AUD, the approach is widely applicable to all psychiatric diagnoses. Even though this study provided a starting place for improving upon fixed diagnostic rules, the resulting diagnostic rule described in Boness et al. is still a global average and, therefore, fails to consider individual differences. Rather than considering demographic-neutral (e.g., gender-neutral) diagnostic criteria, which is highly unlikely to be successful (see Hartung & Widiger, 1998), it may be worth considering “customized” diagnostic criteria to determine if this results in increased incremental validity with respect to relevant external validators. The current paper will demonstrate the capabilities of deriving and comparing the performance of overall and subgroup-specific diagnostic rules utilizing AUD as an example.

As demonstrated in the literature, broadly applying the same diagnostic algorithms may result in misdiagnosis and fail to capture the diagnostic complexity of AUD. This signifies that the application of different, or “customized,” diagnostic rules across subgroups may be potentially useful in addressing the issue of misdiagnosis. We hypothesized that customized AUD diagnostic rules for age, gender, and race/ethnicity, derived using the Stevens et al. (2018) optimization procedure, would perform better compared to a global AUD diagnostic rule derived with the same procedure with respect to predicting relevant AUD external validators. It was anticipated that there would be substantial differences in the optimal solutions among minoritized individuals when compared to the overall optimal solution given they contribute less data to the overall optimal solution. We also expected that the customized diagnoses would result in increased diagnostic discrimination.

Methods

This study utilized public-use data from the 2013 National Survey on Drug Use and Health (NSDUH), a nationally representative sample of United States civilian, non-institutionalized individuals aged 12 or older collected by the Substance Abuse and Mental Health Services Administration (SAMHSA, 2014). In total, 55,160 respondents completed the interview. Our sample was limited to those individuals 18 years of age and older, resulting in 37,424 respondents. This restriction was imposed because several epidemiological and clinical studies have established that AUD diagnostic criteria have limitations when applied to adolescents (e.g., Martin et al., 1996; Martin & Winters, 1998; Winters, 2011). We further excluded participants who did not report consuming alcohol at least 6 times in the past year (n = 3,861), those who left this item blank (n = 9,511), those who refused to report total days of use (n = 33), and those who reported “don’t know” (n = 130), resulting in a total sample size of 23,889. Abstainers were excluded because they do not contribute meaningful variance to the outcomes. The final sample was 47% female. Participants fell into the following age groups: 18-25 (16%), 26-34 (19%), 35-49 (27%), and 50+ (39%), as defined by NSDUH. Nearly three-quarters of the sample were non-Hispanic white (71%) and 13% identified as Hispanic. We acknowledge that race and ethnicity are distinct constructs; however, this is how NSDUH groups race/ethnicity and we were therefore limited to these groupings. Many (51%) were currently married, though a large proportion had never been married (30%) or were divorced/separated (12%). A third of the sample were college graduates (36%), 29% reported “some college,” and 26% reported a high school diploma.

Measures

Clustering Item Set: AUD.

According to the DSM-5 (APA, 2013), Alcohol Use Disorder is described by 11 symptoms. These include: (1) drinking more alcohol or over a longer time period than initially intended (“larger longer”); (2) recurrent desire to cut down on alcohol use or failed attempts to control use (“cut down”); (3) spending a significant amount of time finding, consuming, or recuperating from the effects of alcohol (“time spent”); (4) powerful desire or urge to consume alcohol (“craving”); (5) failure to uphold important responsibilities at work, school, or home due to use (“failure to fulfill”); (6) continued alcohol use regardless of social or relational conflicts (“social interpersonal”); (7) important activities given up or reduced due to use (“give up”); (8) persistent use in situations where there is potential for physical harm to self or others (“hazardous use”); (9) sustained use despite awareness of a physical or psychological ailment caused or made worse by alcohol (“physical psychological”); (10) a need for larger amounts of alcohol to attain the desired effect (“tolerance”); and (11) withdrawal. As described by Boness and colleagues (2018), a “current” modified diagnosis of AUD was estimated whereby individuals endorsed at least 2 out of 10 criteria (craving not assessed) within the past 12 months.

Derivation Variable: Alcohol Consumption.

Self-reported drinking behavior was operationalized with four survey items, consistent with Boness et al. (2018). These included past 30-day quantity and frequency of consumption, past 12-month frequency of consumption, and past 30-day frequency of binge drinking (i.e., five or more drinks on the same occasion where occasion is defined as “at the same time or within a couple hours of each other”). To create a single frequency measure of alcohol consumption similar to the composite used by Stevens and colleagues (2018) in the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) data set, frequency of consumption was averaged for past 30 days and past 12 months. The resulting three variables were standardized by sex due to known sex differences in alcohol consumption. The means of the resulting standardized variables were then summed to create a heaviness of alcohol consumption composite (α = .58 for men, α = .57 for women). Although the alpha is low, this is not surprising given the measure is not meant to be unidimensional and the variables themselves are not meant to be conceptualized as effect indicators (i.e., caused by the same latent variable; Shevlin et al., 2000). When our composite was created in NESARC, the correlation between our consumption composite and that used by Stevens et al. (2018) was r = 0.96.

This “heaviness” composite is considered appropriate for the purpose of the current study for two main reasons. First, heavy use over time is the most parsimonious construct for explaining the neurobiological changes that occur with substance use disorders and for contextualizing the varied social and physical consequences that occur in substance users (Grant et al., 2009; Rehm & Roerecke, 2013; Rehm et al., 2013). Second, heaviness of consumption shows a strong monotonic relationship with the DSM–5 criterion count and this relation is more robust than other possible alternative correlates (e.g., general functioning, psychiatric comorbidity; see Dawson, Saha, & Grant, 2010; Lane & Sher, 2015; Saha et al., 2007). Additional work has established that factor scores derived from a comparable past 12-month consumption composite are heritable, influenced by genetic factors that influence heavy drinking, and stable across time (Agrawal, Lynskey, Heath, & Chassin, 2011).
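The construction of the consumption composite can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function and argument names are hypothetical, the two frequency items are assumed to be on a comparable scale before averaging, and the composite is taken as the mean of the three sex-standardized variables (reading the text as a sum instead changes the result only by a constant factor of three).

```python
from statistics import mean, stdev

def zscores_by_group(values, groups):
    """Standardize values separately within each group (here, sex)."""
    out = [0.0] * len(values)
    for g in set(groups):
        idx = [i for i, x in enumerate(groups) if x == g]
        m = mean(values[i] for i in idx)
        s = stdev([values[i] for i in idx])  # needs >= 2 cases and non-zero spread
        for i in idx:
            out[i] = (values[i] - m) / s
    return out

def consumption_composite(freq_30d, freq_12m, quantity_30d, binge_30d, sex):
    """Hypothetical sketch of a heaviness-of-consumption composite."""
    # Average the two frequency items into a single frequency measure
    # (assumes both are expressed on a comparable scale).
    frequency = [(a + b) / 2 for a, b in zip(freq_30d, freq_12m)]
    # Standardize each of the three resulting variables within sex.
    z = [zscores_by_group(v, sex) for v in (frequency, quantity_30d, binge_30d)]
    # Assumption: combine by taking the mean of the three z-scores per person.
    return [mean(triple) for triple in zip(*z)]

# Toy data for six hypothetical respondents.
sex = ["M", "M", "M", "F", "F", "F"]
composite = consumption_composite(
    freq_30d=[1, 5, 10, 0, 3, 6],
    freq_12m=[2, 6, 12, 1, 4, 8],
    quantity_30d=[1, 3, 6, 1, 2, 4],
    binge_30d=[0, 2, 5, 0, 1, 3],
    sex=sex,
)
```

Because each variable is standardized within sex, the composite has a mean of approximately zero in each sex group, so men and women are compared with their own group's drinking norms.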

External validation.

Although an empirically-derived optimal solution is advantageous, a given optimal solution must also be associated with other measures of AUD to be clinically relevant. Therefore, the importance of external validation in this context is to establish that a clinical diagnosis, in this case the various subgroup optimal solutions, is associated with additional variables, such as those that measure similar constructs as the diagnosis (Grimm & Widaman, 2012). That is, does the optimal solution measure the construct it is intended to measure, in this case, AUD? An additional goal of the external validation was to evaluate whether there is incremental validity of the subgroup optimal solutions above and beyond the overall optimal solution with respect to the prediction of these validators. The same external validators (i.e., known correlates of AUD) used by Boness et al. (2018) were examined (see Table 3). These included variables related to past year treatment usage such as formal treatment (e.g., rehabilitation facility, visiting a counselor) and informal treatment (e.g., Alcoholics Anonymous), as well as indicators of psychopathology and disorders that commonly co-occur with AUD. Age of first drink before or at the age of 15 was also included.

Table 3.

Weighted Odds Ratios (OR) Adjusted for Gender Comparing the Overall Optimal Solution in NSDUH 2013 with the Optimal Solution for Each Subgroup

Groups (left to right): Female (N = 12,081); Male (N = 11,808); 18-25 (N = 11,918)
Columns: External Validator, then for each group: Overall OR (95% CI), Subgroup OR (95% CI), Δc, χ2diff
Formal Treatment 25.29 (13.96, 45.85) 31.54 (17.36, 57.29) 0.020 146364.53* 9.75 (6.39, 14.88) 10.06 (6.64, 15.24) 0.018 123162.14* 6.29 (4.25, 9.30) 5.56 (3.79, 8.15) 0.050 61815.08*
Informal Treatment 31.23 (14.38, 67.82) 37.22 (17.09, 81.06) 0.014 74428.01* 17.29 (9.74, 30.72) 17.22 (9.71, 30.52) 0.013 73488.96* 12.33 (7.30, 20.83) 8.28 (4.81, 14.24) 0.028 11828.42*
Mood Disorder 4.89 (3.58, 6.66) 4.62 (3.31, 6.47) 0.003 7580.67* 6.08 (4.39, 8.42) 6.07 (4.35, 8.47) 0.005 181844.12* 3.52 (2.81, 4.40) 2.43 (2.01, 2.94) 0.008 25998.22*
Anxiety Disorder 3.52 (2.59, 4.79) 3.05 (2.18, 4.27) 0.000 1911.95* 3.69 (2.60, 5.23) 3.37 (2.38, 4.79) 0.003 16437.08* 2.59 (2.00, 3.37) 2.19 (1.77, 2.70) 0.010 44868.25*
Suicide 4.09 (2.87, 5.82) 3.91 (2.67, 5.74) 0.007 4428.00* 4.67 (3.31, 6.60) 4.90 (3.36, 7.14) 0.003 854009.50* 2.72 (2.14, 3.45) 2.25 (1.84, 2.76) 0.016 44652.72*
Cannabis Use Disorder 6.33 (4.23, 9.47) 6.26 (4.13, 9.50) 0.016 7072.64* 5.47 (3.99, 7.50) 5.46 (4.03, 7.40) 0.012 83790.64* 3.22 (2.53, 4.11) 2.94 (2.38, 3.63) 0.019 112222.63*
Drug Use Disorder 12.68 (7.78, 20.65) 13.72 (8.33, 22.60) 0.015 51500.00* 9.53 (6.68, 13.59) 9.27 (6.52, 12.17) 0.009 98921.05* 6.90 (5.15, 9.24) 5.92 (4.54, 7.72) 0.030 138704.66*
Age of first drink ≤ 15 3.23 (2.46, 4.25) 3.22 (2.38, 4.35) 0.003 24582.44* 2.26 (1.80, 2.83) 2.41 (1.93, 3.02) 0.004 228628.20* 2.38 (1.99, 2.84) 2.19 (1.92, 2.50) 0.020 189838.61*
Groups (left to right): 26-34 (N = 3,816); 35-49 (N = 4,838); 50+ (N = 3,317)
Columns: External Validator, then for each group: Overall OR (95% CI), Subgroup OR (95% CI), Δc, χ2diff
Formal Treatment 11.05 (5.67, 21.53) 5.94 (3.12, 11.31) 0.000 118.49* 13.75 (7.07, 26.75) 20.28 (10.66, 38.58) 0.035 257146.88* 28.76 (9.90, 83.49) 32.35 (11.62, 90.01) 0.008 158582.46*
Informal Treatment 17.84 (7.54, 42.40) 11.29 (4.76, 26.75) 0.087 13818.24* 22.63 (10.06, 50.95) 41.16 (18.24, 92.88) 0.065 242695.00* 32.17 (8.19, 126.32) 33.71 (8.67, 131.01) 0.029 74505.33*
Mood Disorder 4.27 (2.70, 6.73) 3.22 (2.21, 4.68) 0.019 68841.87* 7.15 (4.62, 11.06) 4.75 (3.15, 7.15) 0.000 4445.24* 5.61 (2.82, 11.16) 5.91 (2.96, 11.98) 0.003 160364.43*
Anxiety Disorder 3.54 (2.20, 5.71) 2.71 (1.84, 3.99) 0.012 47080.50* 4.31 (2.84, 6.56) 3.91 (2.62, 5.85) 0.004 28004.55* 2.03 (0.88, 4.69) 2.57 (1.27, 5.21) 0.003 76070.43*
Suicide 5.59 (3.24, 9.65) 3.87 (2.42, 6.19) 0.015 35393.06* 5.80 (3.59, 9.39) 5.30 (3.31, 8.50) 0.007 34813.45* 2.48 (1.17, 5.29) 3.90 (1.56, 9.74) 0.001 212878.25*
Cannabis Use Disorder 5.01 (2.53, 9.94) 5.24 (2.97, 9.24) 0.063 125450.17* 6.91 (3.26, 14.65) 4.94 (2.41, 10.14) 0.000 38.08* 2.35 (0.50, 11.11) 5.41 (1.38, 21.13) 0.053 51407.69*
Drug Use Disorder 5.07 (2.65, 9.71) 3.71 (2.06, 6.66) 0.045 28621.52* 18.92 (9.90, 36.15) 17.28 (9.07, 32.91) 0.029 42746.87* 18.08 (6.84, 47.84) 18.31 (7.05, 47.57) 0.013 73117.02*
Age of first drink ≤ 15 2.71 (1.91, 3.84) 1.94 (1.49, 2.53) 0.014 32691.61* 2.81 (1.98, 4.00) 3.44 (2.51, 4.71) 0.004 259434.68* 1.76 (1.01, 3.05) 1.81 (1.09, 3.00) 0.002 47629.33*
Groups (left to right): White (N = 15,653); Black (N = 2,785); Hispanic (N = 3,477)
Columns: External Validator, then for each group: Overall OR (95% CI), Subgroup OR (95% CI), Δc, χ2diff
Formal Treatment 18.44 (11.77, 28.91) 21.04 (13.51, 32.77) 0.022 178567.54* 8.56 (3.25, 22.55) 7.41 (2.72, 20.19) 0.016 6535.07* 9.93 (3.63, 27.15) 8.26 (3.09, 22.10) 0.017 33516.00*
Informal Treatment 31.82 (17.76, 57.00) 31.82 (17.91, 56.55) 0.010 63937.74* 13.24 (3.02, 57.96) 13.42 (3.10, 58.14) 0.091 12880.49* 9.76 (2.32, 41.11) 11.35 (3.08, 41.88) 0.024 64883.04*
Mood Disorder 5.16 (3.92, 6.80) 5.31 (4.02, 7.02) 0.004 106247.53* 4.50 (2.36, 8.59) 4.63 (2.46, 8.70) 0.011 25074.98* 6.13 (3.45, 10.89) 6.65 (3.45, 12.80) 0.004 164766.79*
Anxiety Disorder 3.47 (2.64, 4.56) 3.13 (2.37, 4.12) 0.001 502.99* 4.14 (1.82, 9.40) 3.63 (1.71, 7.73) 0.010 4549.24* 4.04 (2.25, 7.24) 3.70 (2.09, 6.55) 0.010 27180.89*
Suicide 3.84 (2.82, 5.23) 3.80 (2.79, 5.18) 0.004 26372.30* 7.62 (3.76, 15.45) 7.99 (3.97, 16.07) 0.029 45929.45* 5.03 (2.69, 9.39) 5.86 (2.69, 12.73) 0.015 130702.89*
Cannabis Use Disorder 4.86 (3.50, 6.75) 6.28 (4.58, 8.61) 0.009 212428.26* 5.32 (2.91, 9.72) 4.22 (2.41, 7.37) 0.009 1201.68* 8.03 (4.34, 14.88) 5.95 (3.17, 11.14) 0.011 20374.17*
Drug Use Disorder 8.22 (5.84, 11.57) 8.78 (6.26, 12.31) 0.010 91204.02* 12.76 (4.70, 34.64) 18.34 (7.06, 47.63) 0.051 97309.88* 30.79 (15.41, 61.49) 30.45 (15.16, 61.16) 0.028 194904.85*
Age of first drink ≤ 15 2.60 (2.08, 3.25) 2.91 (2.33, 3.64) 0.001 214168.46* 3.06 (1.86, 5.02) 3.11 (1.91, 5.06) 0.004 32405.31* 2.38 (1.58, 3.58) 2.58 (1.67, 3.97) 0.007 110340.31*

Note. The male and female groups do not include gender in the model. χ2diff refers to the Chi-Square difference test between the reduced model (i.e., the overall optimal solution controlling for gender) and the full model (i.e., the reduced model plus the subgroup optimal solution). Weighted refers to the application of NSDUH’s person-level sampling weights to estimate the given statistics. The measure “c” is equivalent to the area under the ROC curve and ranges from 0.5 to 1. Change in c is the full model minus the reduced model.

* p < .001

Although, conceivably, it could be argued that some of these external validators are themselves plagued by issues related to gender-, age-, or race-related health disparities (e.g., differences in utilization of treatment services, prevalence of other co-occurring disorders), we were limited by those variables assessed by NSDUH. Age of first drink might, therefore, be the most objective indicator and the variable least likely to be impacted by these disparities, especially given age of first drink has a strong relationship with alcohol use disorder across groups (e.g., Dawson et al., 2008; Hingson, Heeren, & Winter, 2006).

Optimization Procedure

The deterministic optimization approach employed includes 9 steps with the overall aim of locating globally optimal solutions automatically without the concern of sampling variability. The details of this approach have also been previously described by Stevens et al. (2018). In the current study, the optimization procedure was performed separately on (1) the full sample, and (2) each of the individual subgroups (i.e., nine groups in total: males, females, 18-25, 26-34, 35-49, 50+, non-Hispanic white, non-Hispanic Black, and Hispanic). That is, the optimization procedure was implemented a total of ten times, allowing for the comparison of diagnostic rules. The flow-chart provided in Figure 1 depicts the steps of the optimization algorithm described below.

Figure 1.


Flow-Chart of Steps Used to Obtain Overall and Subgroup Specific Diagnostic Rules. This figure illustrates the steps of the optimization procedure described in the Methods section of the text. DV = derivation variable; AUD = Alcohol Use Disorder; AUD μD = cluster of those diagnosing under rule; AUD μND = cluster of those not diagnosing under rule.

Step 1: Select the data set.

Each of the 10 data sets was subjected to the optimization procedure separately. The example provided in Figure 1 uses the full NSDUH sample, although the steps extend to all of the individual subgroup data sets used in the current manuscript.

Step 2: Create folds for cross-validation.

Consistent with the recommendations of Rodriguez, Perez, and Lozano (2010), the full data set was randomly divided into five non-overlapping folds (i.e., subsets of the data). This is an important cross-validation method commonly used in optimization procedures to reduce over-fitting to the full dataset (e.g., Stone, 1974). For this step, one fold is deemed as the “test data,” while the remaining four folds are labeled as the “training data.” It should be noted, however, that although a standard 5-fold cross-validation approach is used in the current demonstration, other cross-validation approaches (e.g., training on current data and validating on a comparable data set) may be preferred in cases where external and harmonizable data exists.
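The fold construction described in this step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the seed argument are hypothetical, and folds are formed by shuffling row indices and dealing them out in turn.

```python
import random

def make_folds(n_obs, k=5, seed=0):
    """Randomly partition row indices 0..n_obs-1 into k non-overlapping folds."""
    rng = random.Random(seed)
    indices = list(range(n_obs))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

def train_test_splits(folds):
    """Yield (train_indices, test_indices), using each fold as test data once."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

folds = make_folds(n_obs=23, k=5, seed=1)
splits = list(train_test_splits(folds))
```

Changing the seed produces a different random partition, which is what allows the whole procedure to be repeated over many iterations in Step 6.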

Step 3: Assess the separation of diagnostic clusters using complete enumeration.

Step 3 was performed on both the training and test data sets. First, the DV (i.e., consumption), item set (i.e., clustering variables), and base rate were specified a priori. Declaring a base rate in at least one training sample is necessary to ensure that the program does not select an optimal solution only separating those endorsing all symptoms versus those endorsing no symptoms. A solution cannot be selected as the optimal solution if it falls below the defined base rate in all training samples. Requiring this constraint within at least one training sample allows for variation below the declared base rate. This is important because we do not view the input base rate as a hard cutoff for identifying an optimal solution, but rather a rough estimate of the diagnostic rate within the population informed by prior research.

Our item set included the modified DSM-5 AUD symptoms described above, resulting in a total of 10 items. For the full sample, AUD minimum base rate was set at 11.52%, consistent with the estimated weighted DSM-IV AUD (abuse or dependence) prevalence rate in NSDUH 2013 for those who are 18 years of age or older that have consumed alcohol at least 6 times in the past year. The AUD base rates for the demographic subgroups varied from 6.66% to 19.77% (see Table 2). Setting a base rate ensures that, in at least one fold of the data, a minimum of 11.52% (in the case of the full sample) of the observations diagnose with the given diagnostic algorithm. This also allows an understanding of how the diagnosis performs at this proportion or above within the distribution. DSM-IV was chosen over DSM-5 to inform the selected base rate given NSDUH does not include an assessment of craving, one of the DSM-5 criteria.

Table 2.

Summary of Overall and Subgroup Optimal Solutions and Resulting Prevalence Rates

Sample N DSM-IV AUD Base Rate Larger Longer Cut Down Time Spent Failure to Fulfil Social Interpersonal Give Up Hazardous Use Physical Psychological Tolerance Withdrawal threshold set size Weighted Prevalence Unweighted Prevalence Correlation with “Full” Solution
Full 23,889 11.52 X X X X X X 3 6 5.16 7.19

Gender
Male 11,808 14.02 X X X X X X 3 6 7.03 9.17 0.88
Female 12,081 8.69 X X X X X X X X X X 4 10 3.46 5.02 0.79

Age
18-25 11,918 19.77 X X X X X 2 5 18.08 18.04 0.65
26-34 3,816 15.43 X X X X X X X 2 7 13.43 13.13 0.69
35-49 4,838 10.94 X X X X X X X X 3 8 5.86 6.72 0.81
50+ 3,317 6.66 X X X X X X X 3 7 3.43 4.01 0.88

Race/Ethnicity
Non-Hispanic, White 15,653 10.94 X X X X X X X 3 7 4.78 6.91 0.86
Non-Hispanic, Black 2,795 11.09 X X X X X X 2 6 4.66 5.87 0.69
Hispanic 3,477 14.31 X X X X X 2 5 7.22 8.25 0.64

Note. “DSM-IV AUD Base Rate” and “Prevalence” are weighted. The “Full” sample solution refers to the “overall optimal solution.” DSM-IV = Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition. Larger Longer = drinking more alcohol or over a longer time period than initially intended; Cut Down = recurrent desire to cut down on alcohol use or failed attempts to control use; Time Spent = spending a significant amount of time finding, consuming, or recuperating from the effects of alcohol; Craving = powerful desire or urge to consume alcohol; Failure to Fulfil = failure to uphold important responsibilities at work, school, or home due to use; Social Interpersonal = continued alcohol use regardless of social or relational conflicts; Give Up = important activities given up or reduced due to use; Hazardous Use = persistent use in situations where there is potential for physical harm to self or others; Physical Psychological = sustained use despite awareness of a physical or psychological ailment caused or made worse by alcohol; Tolerance = a need for larger amounts of alcohol to attain the desired effect.

The optimization procedure applied here, therefore, assessed every combination of the 10 items (i.e., the full “set”), varying set size (i.e., number of criteria considered for a diagnosis) and diagnostic threshold (i.e., the number of criteria needed to diagnose based on those in the set). This resulted in a total of 2^10 − 1 = 1,023 different combinations, not including the empty set, to be examined at a threshold of one. Table 1 summarizes the number of combinations within each set size and threshold. When the threshold is varied, this results in a total of 5,120 possible diagnostic rules. Each rule groups observations into two clusters: a) those diagnosing under the given rule and b) those not diagnosing under the given rule. These diagnostic clusters are informed by the given set size and diagnostic threshold (out of 5,120 possible combinations). For example, if the combination under examination had a diagnostic threshold of two with a set composed of three criteria: 1) drinking more or longer than intended, 2) tolerance, and 3) withdrawal, then those endorsing at least two of the three criteria will be placed in the diagnostic group and those failing to endorse at least two symptoms will be placed in the non-diagnostic group. Statistics, such as the mean, on the DV (i.e., consumption) can then be obtained from the two clusters (as demonstrated in Figure 1).

Table 1.

All Possible Diagnostic Item Set Combinations by Criteria Set Size and Diagnostic Threshold

Rows: Criteria Set Size. Columns: Diagnostic Threshold. A dash marks thresholds that exceed the set size.

Set Size      1     2     3     4     5     6     7     8     9    10   Total
1            10     -     -     -     -     -     -     -     -     -      10
2            45    45     -     -     -     -     -     -     -     -      90
3           120   120   120     -     -     -     -     -     -     -     360
4           210   210   210   210     -     -     -     -     -     -     840
5           252   252   252   252   252     -     -     -     -     -    1260
6           210   210   210   210   210   210     -     -     -     -    1260
7           120   120   120   120   120   120   120     -     -     -     840
8            45    45    45    45    45    45    45    45     -     -     360
9            10    10    10    10    10    10    10    10    10     -      90
10            1     1     1     1     1     1     1     1     1     1      10

Total      1023  1013   968   848   638   386   176    56    11     1    5120

Note. Given craving was not available in NSDUH, the maximum diagnostic threshold and criteria set size was 10.
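The counts in Table 1 can be reproduced by enumerating every non-empty item set and every threshold from one up to the set size. The sketch below is illustrative only; the criterion labels and function name are hypothetical shorthand for the 10 NSDUH criteria.

```python
from collections import Counter
from itertools import combinations

# Hypothetical short labels for the 10 criteria available in NSDUH.
CRITERIA = [
    "larger_longer", "cut_down", "time_spent", "failure_to_fulfill",
    "social_interpersonal", "give_up", "hazardous_use",
    "physical_psychological", "tolerance", "withdrawal",
]

def enumerate_rules(criteria):
    """Yield every (item_set, threshold) pair, i.e., every diagnostic rule."""
    for set_size in range(1, len(criteria) + 1):
        for item_set in combinations(criteria, set_size):
            # The threshold can never exceed the number of items in the set.
            for threshold in range(1, set_size + 1):
                yield item_set, threshold

rules = list(enumerate_rules(CRITERIA))

# Tally rules by (set size, threshold), mirroring the cells of Table 1.
cell_counts = Counter((len(item_set), threshold) for item_set, threshold in rules)
n_item_sets = sum(1 for _, threshold in rules if threshold == 1)

print(len(rules))           # 5120 diagnostic rules in total
print(n_item_sets)          # 1023 non-empty item sets
print(cell_counts[(5, 3)])  # 252 rules with set size 5 and threshold 3
```

With only 10 criteria the full rule space is small enough to list exhaustively, which is why no search heuristic is needed to locate the globally optimal rule.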

Based on the statistic of interest, the separation between clusters can then be assessed utilizing a measure of distance. Conceivably, any range of distance measures could be used. Here, Cohen’s d was utilized to characterize the separation between the resulting diagnostic clusters based on consumption. The complete enumeration of all rules allows for an assessment of the degree of separation in mean levels of consumption between those diagnosing with AUD and those not diagnosing with AUD across all 5,120 possible combinations, thereby providing 5,120 estimates of Cohen’s d (see Table 1).
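Applying a single rule and measuring the separation it produces can be sketched as below. This is a hedged illustration: the function names and data layout are hypothetical, and Cohen's d is computed with a pooled standard deviation, which is a common convention and an assumption here, since the text does not state which variant was used.

```python
from math import sqrt
from statistics import mean, variance

def diagnose(endorsed, item_set, threshold):
    """True if the respondent endorses at least `threshold` criteria in the set."""
    return len(set(endorsed) & set(item_set)) >= threshold

def cohens_d(group_a, group_b):
    """Cohen's d using a pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("no within-group variability; d is undefined")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def rule_separation(respondents, item_set, threshold):
    """respondents: list of (endorsed_criteria, consumption_score) pairs."""
    diagnosed = [c for e, c in respondents if diagnose(e, item_set, threshold)]
    not_diagnosed = [c for e, c in respondents if not diagnose(e, item_set, threshold)]
    if len(diagnosed) < 2 or len(not_diagnosed) < 2:
        return None  # too few cases in a cluster to estimate a variance
    return cohens_d(diagnosed, not_diagnosed)

# The example rule from the text: at least two of three criteria.
rule_items = ("larger_longer", "tolerance", "withdrawal")
toy_respondents = [
    ({"larger_longer", "tolerance"}, 2.0),
    ({"tolerance", "withdrawal", "cut_down"}, 3.0),
    ({"larger_longer"}, 0.0),
    (set(), 1.0),
]
d = rule_separation(toy_respondents, rule_items, threshold=2)
```

In the toy data the first two respondents meet the rule and the last two do not, so d compares consumption scores of 2.0 and 3.0 against 0.0 and 1.0.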

Step 4: Repeat steps 2 and 3.

Steps 2 and 3 are repeated such that each fold is designated as the test fold once. This results in five estimates of Cohen’s d for each of the 5,120 diagnostic rules. Diagnostic rules were considered eligible as the optimal solution if the prevalence of AUD was greater than or equal to the pre-specified base rate of 11.52% (in the case of the full sample) in at least one of the training samples.

Step 5: Obtain average separation measures across training and test data sets.

The values of Cohen’s d for the training and test data sets were averaged across the five folds, resulting in an average fold training Cohen’s d and an average fold test Cohen’s d.

Step 6: Repeat steps 2 through 5.

To ensure that the resulting Cohen’s d values were not determined strictly by how the folds were created in Step 2, the optimization was repeated in a series of iterations. Steps 2-5 were repeated for 100 iterations, resulting in 100 average fold estimates of both the training and test Cohen’s d.

Step 7: Obtain all estimates of separation across the iterations.

After Step 6, there were 100 average estimates of the training and test Cohen’s d for each diagnostic rule. These estimates were then averaged, resulting in an overall average training and test Cohen’s d. The maximum value of Cohen’s d in the test sample therefore provided the optimal diagnostic rule given the constraint placed on the training sample requiring the solution meet the pre-specified base rate.
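The fold-and-iterate logic of Steps 2 through 7 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the data are simulated, only four hypothetical criteria are used to keep the enumeration small, the number of iterations is reduced from 100, Cohen’s d uses a pooled standard deviation, and the base-rate constraint is read as "met in at least one training sample across all folds and iterations."

```python
import random
from itertools import combinations
from math import sqrt
from statistics import mean

def cohens_d(a, b):
    """Pooled-SD Cohen's d; None when a group is too small to estimate variance."""
    if len(a) < 2 or len(b) < 2:
        return None
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ss = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    return (mean_a - mean_b) / sqrt(pooled_var) if pooled_var > 0 else None

def evaluate(rule, rows):
    """Return (Cohen's d on consumption, prevalence of diagnosis) for one rule."""
    items, threshold = rule
    dx, no_dx = [], []
    for symptoms, consumption in rows:
        met = sum(symptoms[i] for i in items) >= threshold
        (dx if met else no_dx).append(consumption)
    return cohens_d(dx, no_dx), len(dx) / len(rows)

def simulate(n, n_items, seed=1):
    """Invented data: symptoms and consumption both depend on a latent severity."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severity = rng.random()
        symptoms = tuple(int(rng.random() < 0.6 * severity) for _ in range(n_items))
        rows.append((symptoms, 10 * severity + rng.gauss(0, 1)))
    return rows

def repeated_cv(rows, n_items, base_rate, k=5, iterations=10, seed=0):
    rng = random.Random(seed)
    rules = [(s, t) for m in range(1, n_items + 1)
             for s in combinations(range(n_items), m) for t in range(1, m + 1)]
    train_means = {r: [] for r in rules}
    test_means = {r: [] for r in rules}
    eligible = set()
    for _ in range(iterations):                      # Step 6: repeat with new folds
        shuffled = rows[:]
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]   # Step 2: create k folds
        for rule in rules:
            train_d, test_d = [], []
            for i in range(k):                       # Step 4: each fold is the test fold once
                train = [row for j, fold in enumerate(folds) if j != i for row in fold]
                d_train, prevalence = evaluate(rule, train)   # Step 3, training data
                d_test, _ = evaluate(rule, folds[i])          # Step 3, test data
                if prevalence >= base_rate:
                    eligible.add(rule)               # base-rate constraint met at least once
                if d_train is not None and d_test is not None:
                    train_d.append(d_train)
                    test_d.append(d_test)
            if test_d:                               # Step 5: average across folds
                train_means[rule].append(mean(train_d))
                test_means[rule].append(mean(test_d))
    # Step 7: average across iterations; pick the eligible rule with the largest test d.
    overall = {r: (mean(train_means[r]), mean(test_means[r]))
               for r in eligible if test_means[r]}
    best = max(overall, key=lambda r: overall[r][1])
    return best, overall[best]

rows = simulate(300, n_items=4)
best_rule, (train_d, test_d) = repeated_cv(rows, n_items=4, base_rate=0.10)
print(best_rule, round(train_d, 2), round(test_d, 2))
```

Selecting on the averaged test-fold d, rather than the training d, is what guards against a rule that only happens to separate one particular split of the data.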

Step 8: Obtain optimal diagnostic rules for all subgroups.

Steps 1 through 7 were repeated across the nine demographic subgroups separated by gender (male, female), age (18-25, 26-34, 35-49, 50+), and race/ethnicity (non-Hispanic white, non-Hispanic Black, Hispanic) to determine each subgroup-specific optimal solution.

Step 9: External validation.

The performance of the full sample optimal solution compared to the subgroup optimal solutions was examined via the use of the external validators previously described. SAS PROC SURVEYLOGISTIC was used to estimate weighted odds ratios (ORs) predicting the relevant validators across the ten optimal solutions. Person-level sampling weights were applied to account for NSDUH’s independent, multistage area probability sample design (SAMHSA, 2014). To compare the overall optimal solution with the subgroup solutions, we also estimated Δc and χ2diff. χ2diff refers to the Chi-Square difference (i.e., a measure of model fit) between the reduced model (i.e., the overall optimal solution controlling for gender) and the full model (i.e., the reduced model plus the subgroup optimal solution), resulting in a one degree of freedom test. The measure “c” is equivalent to the area under the ROC curve and ranges from .5 to 1, where .5 corresponds to the model randomly predicting the response and 1 corresponds to the model perfectly discriminating the response. Change in c (Δc) is the c of the full model minus the c of the reduced model.
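To make the Δc comparison concrete, the following Python sketch computes c as the proportion of concordant (event, non-event) pairs, which is one standard way to obtain the area under the ROC curve. The predicted probabilities are invented stand-ins for the output of the reduced and full logistic models, and the sketch ignores the survey weights that PROC SURVEYLOGISTIC applies; it is an illustration of the statistic, not a reproduction of the analysis.

```python
def c_statistic(scores, outcomes):
    """Proportion of (event, non-event) pairs in which the event has the
    higher predicted score; tied pairs count as one half."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))

# Invented outcomes (1 = validator present) and predicted probabilities.
outcomes = [1, 1, 1, 0, 0, 0, 0, 0]
reduced_probs = [0.6, 0.4, 0.3, 0.5, 0.3, 0.2, 0.2, 0.1]   # overall solution only
full_probs = [0.7, 0.5, 0.4, 0.45, 0.3, 0.2, 0.2, 0.1]     # plus subgroup solution

c_reduced = c_statistic(reduced_probs, outcomes)
c_full = c_statistic(full_probs, outcomes)
delta_c = c_full - c_reduced
print(round(c_reduced, 3), round(c_full, 3), round(delta_c, 3))
```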

Results

Table 2 presents the optimal solutions for the overall (i.e., full) sample and for each of the subgroups by demographic characteristic (i.e., gender, age, race/ethnicity). The overall optimal solution indicated a set size of six criteria with a diagnostic threshold of three from the following: (1) attempts to quit or cut down, (2) a significant amount of time spent using, obtaining, or recovering from alcohol, (3) use in situations that were potentially hazardous, (4) continuing to drink despite physical or psychological problems, (5) tolerance, and (6) withdrawal. Note that this optimal solution differs from that presented by Boness et al. (2018), which found that the overall optimal solution included a set size of nine with a diagnostic threshold of three, for two reasons. First, Boness et al. utilized NSDUH 2010 and 2013 to find the average optimal solution between the two data sets using a variation of the current optimization procedure. Second, Boness et al. excluded participants under age 21 while the current study used those 18 or older. (Boness et al. was intended to extend the work of Stevens et al., 2018, which was restricted to those 21 years and older by NESARC Wave 2.) Given NSDUH includes those under age 18, we expanded the age range for the current study. The subgroup optimal solutions vary markedly. The criteria set sizes ranged from five to ten criteria (M = 6.70) and the diagnostic thresholds ranged from two to four (M = 2.70). The subgroup optimal solutions were all highly correlated with the overall optimal solution (M = 0.73; range = 0.64-0.88; see Table 2).

Results from the external validation (Table 3) demonstrated that “customized” subgroup diagnostic rules performed about as well as, and, at times, better (indicated by larger odds ratios [ORs] for the subgroup solutions), than the overall optimal solution in the prediction of relevant AUD correlates. Subgroup optimal solutions typically resulted in increased discriminatory power and increased model fit when compared to the overall optimal solution (indicated by Δc and χ2diff). Overall, the subgroup optimal solutions demonstrated incremental validity above and beyond the overall optimal solution.

For example, among females, Table 3 demonstrates that the female-specific optimal solution predicted higher odds of formal and informal treatment history and of drug use disorder. However, the female-specific optimal solution was not superior to the overall optimal solution in the prediction of a mood disorder, anxiety disorder, suicide history, cannabis use disorder, or age of first drink being before age 16. When looking specifically at Δc and χ2diff, however, the female-specific optimal solution was always superior or equal to the overall optimal solution when predicting these outcomes (although to varying degrees; i.e., Δc ranged from 0.0% to 2.0%). These trends, as they relate to Δc and χ2diff, largely held over all subgroup-specific optimal solutions in the prediction of external validators.

When we compared the overall optimal solution with DSM-IV and the modified DSM-5 (excluding craving) diagnoses, agreement was moderate (DSM-IV: Kappa = 0.59, Phi = 0.65; modified DSM-5: Kappa = 0.52, Phi = 0.59). However, when we examined the tetrachoric correlations, which consider base rate, agreement was much higher (DSM-IV: tetrachoric correlation = 0.99; modified DSM-5: tetrachoric correlation = 0.99). It should also be noted that although the external validation indicates increased predictability of subgroup-specific diagnostic rules, the confidence intervals around the ORs are quite large due to predicting rare events within NSDUH (e.g., treatment-seeking Black individuals). Comparing the overall and subgroup optimal solutions to DSM-5 in NSDUH 2013 and an external data set (NSDUH 2010; N = 25,034) demonstrated the solutions were consistently highly correlated with DSM-5 (see Table 4). Further, subgroup optimal solution correlations with DSM-5 were similar across both data sets. Within NSDUH 2010, correlations with DSM-5 ranged from 0.54 to 0.87 compared to 0.53 to 0.88 within NSDUH 2013.
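The kappa and phi agreement statistics reported above can be computed from the 2 x 2 cross-classification of two binary diagnoses. The Python sketch below uses invented diagnosis vectors with deliberately different base rates, which illustrates why kappa and phi can be only moderate even when one diagnosis is nested entirely within the other; the function name and data are assumptions, and the tetrachoric correlation is not computed here.

```python
from math import sqrt

def kappa_and_phi(a, b):
    """Cohen's kappa and the phi coefficient for two binary (0/1) vectors."""
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    n = n11 + n10 + n01 + n00
    observed = (n11 + n00) / n
    p_a, p_b = (n11 + n10) / n, (n11 + n01) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)   # chance agreement
    kappa = (observed - expected) / (1 - expected)
    phi = (n11 * n00 - n10 * n01) / sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    return kappa, phi

# Invented diagnoses: rule B is stricter, so everyone it diagnoses is also
# diagnosed by rule A, but the two base rates differ (40% vs. 20%).
rule_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
rule_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa, phi = kappa_and_phi(rule_a, rule_b)
print(round(kappa, 3), round(phi, 3))
```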

Table 4.

Correlations between Optimal Solutions Derived from NSDUH 2013 and DSM-5 Across Data Sets

                       DSM-5
Optimal Solution       NSDUH 2010 (N = 25,034)    NSDUH 2013 (N = 23,889)

Full                   0.59                       0.59
Male                   0.62                       0.61
Female                 0.54                       0.54
18-25                  0.83                       0.84
26-34                  0.87                       0.88
35-49                  0.67                       0.68
50+                    0.67                       0.66
Non-Hispanic, White    0.61                       0.60
Non-Hispanic, Black    0.56                       0.53
Hispanic               0.60                       0.59

Note. DSM-5 corresponds to the modified DSM-5 diagnosis which excludes craving. The “Full” sample solution refers to the “overall optimal solution.” DSM-5 = Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition.

Discussion

Findings demonstrate the utility of using deterministic optimization, and specifically complete enumeration, for deriving AUD diagnostic algorithms. In this case, there is statistical support (i.e., incremental validity) for customized diagnostic algorithms derived via empirical optimization across the demographic subgroups of age, gender, and race/ethnicity. However, the increased incremental validity was not substantial and, therefore, calls into question the clinical significance of these findings. Though, statistically, the subgroup-specific diagnoses may be preferred on the basis of incremental validity, the increased clinician burden of applying a separate diagnostic rule based on demographic subgroup may make this argument less convincing.

These results provide an important extension of Boness et al. (2018) in that they demonstrate customized diagnostic rules result in incremental validity above and beyond a “one-size-fits-all,” global diagnostic rule. From the perspective of Finn (1982), this also suggests increased diagnostic accuracy if these customized rules are applied. However, consideration of the magnitude of incremental validity along with clinician burden seems to suggest that a “one-size-fits-all,” global diagnostic rule derived via deterministic optimization may still be superior. Further, the reduction from ten items to six in the case of the overall solution still provides evidence for increased parsimony and efficiency. The empirical optimization approach, as a whole, provides a significant improvement over the clinical-consensus process used by traditional nosologic systems (e.g., DSM, ICD) and demonstrates the usefulness of utilizing statistical approaches for the derivation of diagnostic criteria sets and thresholds (for a further discussion see Boness et al., 2018).

Moreover, even if the results were more compelling, development of population-specific diagnostic criteria would be limited by additional concerns. First, it is not clear how best to define subpopulations given the intersectionality (e.g., combinations of gender, age, and ethnicity) that characterizes individuals. That is, there is a very large set of possible subpopulations to consider, and how to define the most useful ones for developing customized diagnostic rules is less than clear. Further, these results tell the reader little about how to combine information about age, gender, and race/ethnicity when making diagnoses (e.g., which diagnostic rule should one choose for a Black female?). Although it is certainly possible to optimize on subgroups defined by multiple demographic characteristics, these analyses were not possible in the current sample due to sparse sample sizes when considering multiple demographic characteristics. However, such approaches could be possible if sufficient numbers of individuals exist across multiple, harmonizable databases. Regardless, this still raises several questions, including how to “categorize” individuals into these subgroups, how to address non-binary individuals (e.g., those who identify as transgender), and who should categorize individuals into these groups (e.g., should the clinician categorize the patient or should the patient self-select into their preferred group?). Of further consideration, the types of consequences that are implicit in some DSM criteria (e.g., drinking despite impaired role functioning, hazardous use in the form of drinking and driving) are highly contextual (e.g., implying being part of a family or being employed, use of an automobile) and, therefore, introduce another limitation to the generalizability of diagnostic criteria across diverse individuals (e.g., Martin, Langenbucher, Chung, & Sher, 2014). Thus, although diversity beyond the broad distinctions made in our analyses can certainly be entertained, such efforts are conceptually complex and present major empirical challenges (i.e., having sufficiently large samples to conduct relevant analyses). Future work should address these unanswered questions.

This is not to say that such “customized” diagnostic rules could not be useful. A potential clinical application of these customized diagnostic rules, consistent with the use of racial/ethnic-specific reference ranges in various clinical medicine diagnostic tests (e.g., Rappoport et al., 2018), would be to assess an individual using the full DSM-5 AUD criteria set (or full overall optimal solution) and then apply optimization-derived sub-algorithms (i.e., subsets of criteria and diagnostic thresholds) that differ by their given demographic characteristics. This practice would serve to maintain a standard assessment across individuals, rather than requiring the clinician to know the exact subgroup diagnostic rule upfront, which would arguably result in considerable clinician burden. A subgroup-specific diagnosis could then be determined by evaluating the individual’s symptoms with respect to the applicable subgroup diagnostic rule. This could be either primary or supplemental to a fixed diagnostic rule. Subgroup-specific diagnoses are also consistent with the spirit of precision medicine whereby diagnosis is tailored to specific population characteristics. An approach such as the one described here would allow for future refinement of diagnosis using more complete assessments. However, the efficiency of a criteria set composed of six criteria would be lost. That is, although the overall optimal solution contained only six criteria, each of the 10 candidate criteria appeared in at least two population-specific solutions. Having an omnibus criteria set that would allow “secondary” specific diagnosis would require the entire candidate criteria set be assessed. If the goal of optimization is to create efficient criteria sets that retain the validity of larger sets while being easier to use in the clinic and impose less clinician and patient burden, customized diagnoses, even as an adjunct, might not be worth the time and effort. Still, we note that the approach of subgroup customization has already been utilized in the realm of consumption measures (e.g., U.S. Dept. of Health, 2015) where criteria for risky drinking, weekly limits, and binge drinking are adjusted based on sex (e.g., 4+ drinks for females vs. 5+ drinks for males) and such adjustments have also been proposed for age in the case of binge drinking (Donovan, 2009). However, in the case of consumption, the same items can be used across subpopulations, albeit with differing thresholds.
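The "assess once, then apply a subgroup rule" workflow described above might look like the following Python sketch. Everything in it is hypothetical: the criterion labels `c1` through `c10`, the subgroup key `group_a`, and both rules are placeholders and are not the solutions reported in Table 2. The sketch also sidesteps the open question raised earlier of which rule applies when a person belongs to several subgroups.

```python
# Hypothetical rules for illustration only; NOT the solutions in Table 2.
# Each rule is (criteria set, diagnostic threshold).
SUBGROUP_RULES = {
    "overall": ({"c1", "c2", "c3", "c4", "c5", "c6"}, 3),
    "group_a": ({"c1", "c2", "c5", "c7", "c9"}, 2),
}

def subgroup_diagnosis(endorsed, subgroup):
    """Apply the rule for `subgroup` to the full set of endorsed criteria,
    falling back to the overall rule when no subgroup rule is defined."""
    criteria, threshold = SUBGROUP_RULES.get(subgroup, SUBGROUP_RULES["overall"])
    return len(set(endorsed) & criteria) >= threshold

# One standard assessment of all candidate criteria, scored under two rules.
endorsed = {"c2", "c7", "c10"}
print(subgroup_diagnosis(endorsed, "overall"))
print(subgroup_diagnosis(endorsed, "group_a"))
```

Because the full candidate set must be assessed before any rule is applied, the sketch also shows why the efficiency gain of a shorter criteria set is lost under this workflow.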

It should be noted that resulting optimal diagnostic rules are dependent upon several factors. First, the clustering variables are an important factor in the algorithm. In the current application, we used 10 of the 11 DSM-5 AUD criteria. However, the inclusion of craving or any other non-criterial symptom (e.g., chronicity) would likely produce markedly different solutions. To address this more concretely, a simulation examining the same optimization procedure is presented in Stevens et al. (2018). This simulation demonstrated that although vastly different rules may be produced when a) noise is added to the clusters and b) there is a high degree of overlap on the derivation variable, agreement statistics indicated that the diagnostic rules are classifying people nearly identically. Although the selected diagnostic rule may not initially appear to be robust given it is likely to change, the simulation in Stevens et al. (2018) demonstrated that the resulting cluster assignments were robust. Similarly, changes to any decisions made a priori (e.g., the derivation variable, objective function, sampling composition, base rate) will inevitably result in different optimal diagnostic rules. This does not mean that the resulting diagnostic rule is not robust, but instead suggests that we should examine the robustness of the resulting cluster structure and its association with relevant external validators.

Conclusions

The derivation of diagnostic rules using a deterministic (or exact) optimization approach may be a viable technique for achieving empirically-derived diagnostic algorithms. Although it is reassuring that the overall optimal solution, based on the full sample, resulted in a robust diagnostic algorithm that performed well across a range of individuals, there are still several practical problems that need to be addressed and clarified before this approach to deriving diagnoses is preferred for specific demographic subgroups. Although the finding that the overall optimal solution can reduce the set size (from 10 diagnostic criteria in the modified DSM-5 algorithm to six for the overall optimal solution) provides preliminary evidence for some of the potential benefits of this approach (e.g., decreased clinician burden when compared to DSM-5), further validation at the subgroup level is required. For those interested in the use of customized diagnostic algorithms for demographic subgroups, it appears that there are significant challenges that must first be overcome before the field is likely to benefit from such an approach.

Acknowledgments

Cassandra L. Boness, Jordan E. Stevens, Douglas Steinley, Timothy Trull, and Kenneth J. Sher, Department of Psychological Science, University of Missouri. The present study was supported by the NIH grants F31AA026177, K05AA017242, T32AA013526, and R01AA024133.

Abbreviations

ADHD: Attention Deficit/Hyperactivity Disorder

AUD: Alcohol Use Disorder

CIDI: Composite International Diagnostic Interview

DSM-IV: Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition

DSM-5: Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition

IRT: Item Response Theory

NESARC: National Epidemiologic Survey on Alcohol and Related Conditions

NSDUH: National Survey on Drug Use and Health

OR: Odds ratio

PTSD: Post-Traumatic Stress Disorder

SAMHSA: Substance Abuse and Mental Health Services Administration

SAS: Statistical Analysis System

Footnotes

All documented syntax files and executable functions are provided at the following link: https://github.com/jes9bc/Complete-Enumeration.

References

  1. Agrawal A, Lynskey MT, Heath AC, & Chassin L (2011). Developing a genetically informative measure of alcohol consumption using past-12-month indices. Journal of Studies on Alcohol and Drugs, 72, 444–452.
  2. American Psychiatric Association. (2013a). Diagnostic and statistical manual of mental disorders (5th Ed.). Arlington, VA: American Psychiatric Publishing.
  3. American Psychiatric Association. (2013b). DSM-5 and diagnoses for children. Retrieved from https://www.psychiatry.org/File%20Library/Psychiatrists/Practice/DSM/APA_DSM-5-Diagnoses-for-Children.pdf
  4. Balsis S, Gleason ME, Woods CM, & Oltmanns TF (2007). An item response theory analysis of DSM-IV personality disorder criteria across younger and older age groups. Psychology and Aging, 22(1), 171.
  5. Boness CL, Stevens JE, Steinley D, Trull T, & Sher KJ (2018). Deriving alternative criteria sets for alcohol use disorders using statistical optimization: Results from the National Survey on Drug Use and Health. Experimental and Clinical Psychopharmacology, 27(3), 283–296.
  6. Dawson DA, Goldstein RB, Patricia Chou S, June Ruan W, & Grant BF (2008). Age at first drink and the first incidence of adult-onset DSM-IV alcohol use disorders. Alcoholism: Clinical and Experimental Research, 32(12), 2149–2160.
  7. Dawson DA, Saha TD, & Grant BF (2010). A multidimensional assessment of the validity and utility of alcohol use disorder severity as determined by item response theory models. Drug and Alcohol Dependence, 107, 31–38.
  8. Donovan JE (2009). Estimated blood alcohol concentrations for child and adolescent drinking and their implications for screening instruments. Pediatrics, 123(6), e975.
  9. Finn SE (1982). Base rates, utilities, and DSM-III: Shortcomings of fixed-rule systems of psychodiagnosis. Journal of Abnormal Psychology, 91(4), 294–302.
  10. Frances AJ, & Widiger T (2012). Psychiatric diagnosis: Lessons from the DSM-IV past and cautions for the DSM-5 future. Annual Review of Clinical Psychology, 8, 109–130.
  11. Grant BF, & Weissman MM (2007). Gender and the prevalence of psychiatric disorders. In: Narrow WE, First MB, Sirovatka PJ, & Regier DA (Eds.), Age and gender considerations in psychiatric diagnosis: A research agenda for DSM-V (pp. 31–46). Washington, DC: American Psychiatric Association.
  12. Grant JD, Agrawal A, Bucholz KK, Madden PA, Pergadia ML, Nelson EC, … Heath AC (2009). Alcohol consumption indices of genetic risk for alcohol dependence. Biological Psychiatry, 66, 795–800.
  13. Greif Green J, Gruber MJ, Kessler RC, Lin JY, McLaughlin KA, Sampson NA, … & Alegria M (2012). Diagnostic validity across racial and ethnic groups in the assessment of adolescent DSM-IV disorders. International Journal of Methods in Psychiatric Research, 21(4), 311–320.
  14. Grimm KJ, & Widaman KF (2012). Construct validity. In: Cooper J, Camic PM, Long AT, Panter T, Rindskopf D, & Sher KJ (Eds.), APA handbook of research methods in psychology, Vol. 1. Foundations, planning, measures, and psychometrics (pp. 621–642).
  15. Harford TC, Yi HY, Faden VB, & Chen CM (2009). The dimensionality of DSM-IV alcohol use disorders among adolescent and adult drinkers and symptom patterns by age, gender, and race/ethnicity. Alcoholism: Clinical and Experimental Research, 33(5), 868–878.
  16. Hartung CM, & Widiger TA (1998). Gender differences in the diagnosis of mental disorders: Conclusions and controversies of the DSM–IV. Psychological Bulletin, 123(3), 260.
  17. Hasin DS, O’Brien CP, Auriacombe M, Borges G, Bucholz K, Budney A, … Grant BF (2013). DSM-5 criteria for substance use disorders: Recommendations and rationale. American Journal of Psychiatry, 170(8), 834–851.
  18. Hingson RW, Heeren T, & Winter MR (2006). Age at drinking onset and alcohol dependence: age at onset, duration, and severity. Archives of Pediatrics & Adolescent Medicine, 160(7), 739–746.
  19. Hoertel N, Peyre H, Wall MM, Limosin F, & Blanco C (2014). Examining sex differences in DSM-IV borderline personality disorder symptom expression using Item Response Theory (IRT). Journal of Psychiatric Research, 59, 213–219.
  20. Lane SP, & Sher KJ (2015). Limits of current approaches to diagnosing severity based on criterion counts: An example with DSM–5 alcohol use disorder. Clinical Psychological Science, 3, 819–835.
  21. Lane SP, Steinley D, & Sher KJ (2016). Meta-analysis of DSM alcohol use disorder criteria severities: structural consistency is only “skin deep.” Psychological Medicine, 46(8), 1769–1784.
  22. Martin CS, Langenbucher JW, Chung T, & Sher KJ (2014). Truth or consequences in the diagnosis of substance use disorders. Addiction, 109(11), 1773–1778.
  23. Martin CS, Langenbucher JW, Kaczynski NA, & Chung T (1996). Staging in the onset of DSM-IV alcohol symptoms in adolescents: survival/hazard analyses. Journal of Studies on Alcohol, 57(5), 549–558.
  24. Martin CS, & Winters KC (1998). Diagnosis and assessment of alcohol use disorders among adolescents. Alcohol Health and Research World, 22, 95–105.
  25. McLean CP, Asnaani A, Litz BT, & Hoffman SG (2011). Gender differences in anxiety disorders: Prevalence, course of illness, comorbidity and burden of illness. Journal of Psychiatric Research, 45(8), 1027–1035.
  26. Raffo CD, Hasin DS, Appelbaum P, & Wall MM (2019). A data-driven method for identifying shorter symptom criteria sets: The case for DSM-5 alcohol use disorder. Psychological Medicine, 49(6), 931–939.
  27. Rappoport N, Paik H, Oskotsky B, Tor R, … & Butte AJ (2017). Creating ethnicity-specific reference intervals for lab tests from EHR data. bioRxiv, 213892.
  28. Rehm J, Marmet S, Anderson P, Gual A, Kraus L, Nutt DJ, … Gmel G (2013). Defining substance use disorders: Do we really need more than heavy use? Alcohol and Alcoholism, 48, 633–640.
  29. Rehm J, & Roerecke M (2013). Reduction of drinking in problem drinkers and all-cause mortality. Alcohol and Alcoholism, 48, 509–513.
  30. Riolo SA, Nguyen TA, Greden JF, & King CA (2005). Prevalence of depression by race/ethnicity: findings from the National Health and Nutrition Examination Survey III. American Journal of Public Health, 95(6), 998–1000.
  31. Rodriguez JD, Perez A, & Lozano JA (2010). Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 569–575.
  32. Saha TD, Stinson FS, & Grant BF (2007). The role of alcohol consumption in future classifications of alcohol use disorders. Drug and Alcohol Dependence, 89, 82–92.
  33. Shevlin M, Miles JN, Davies MN, & Walker S (2000). Coefficient alpha: A useful indicator of reliability? Personality and Individual Differences, 28(2), 229–237.
  34. Srisurapanont M, Kittiratanapaiboon P, Likhitsathian, … & Junsirimongkol B (2012). Patterns of alcohol dependence in Thai drinkers: A differential item functioning analysis of gender and age bias. Addictive Behaviors, 37(2), 173–178.
  35. Steinley D, Lane SP, & Sher KJ (2016). Determining optimal diagnostic criteria through chronicity and comorbidity. In Silico Pharmacology, 4(1), 1–12.
  36. Stevens JE, Steinley D, Boness CL, Trull T, Wood P, & Sher KJ (2018). Combinatorial optimization of classification decisions: An application to refine psychiatric diagnoses. 10.31234/OSF.IO/JPNMF
  37. Stevens JE, Steinley D, McDowell YE, Boness CL, Trull TJ, Martin CS, & Sher KJ (2019). Toward more efficient diagnostic criteria sets and rules: The use of optimization approaches in addiction science. Addictive Behaviors, (February), 0–1.
  38. Stone M (1974). Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society. Series B (Methodological), 111–147.
  39. Substance Abuse and Mental Health Services Administration (SAMHSA). (2014). Results from the 2013 National Survey on Drug Use and Health: Summary of national findings, NSDUH Series H-48, HHS Publication No.14-4863. Rockville, MD: SAMHSA, 13–4795.
  40. U.S. Dept. of Health and Human Services & U.S. Dept. of Agriculture. (2015). Dietary Guidelines for Americans. Retrieved from http://health.gov/dietaryguidelines/2015/guidelines/
  41. Wagner EF, Lloyd DA, & Gil AG (2002). Racial/ethnic and gender differences in the incidence and onset age of DSM-IV alcohol use disorder symptoms among adolescents. Journal of Studies on Alcohol, 63(5), 609–619.
  42. Wakefield JC (2015). DSM-5 substance use disorder: How conceptual missteps weakened the foundations of the addictive disorders field. Acta Psychiatrica Scandinavica, 132(5), 327–334.
  43. Winters KC (2011). Commentary on O’Brien: Substance use disorders in DSM-V when applied to adolescents. Addiction, 106(5), 882–884.
