Item-level genome-wide association study of the Alcohol Use Disorders Identification Test in three population-based cohorts

Travis T Mallard; Jeanne E Savage; Emma C Johnson; Yuye Huang; Alexis C Edwards; Jouke J Hottenga; Andrew D Grotzinger; Daniel E Gustavson; Mariela V Jennings; Andrey Anokhin; Danielle M Dick; Howard J Edenberg; John R Kramer; Dongbing Lai; Jacquelyn L Meyers; Ashwini K Pandey; Kathryn Paige Harden; Michel G Nivard; Eco JC de Geus; Dorret I Boomsma; Arpana Agrawal; Lea K Davis; Toni-Kim Clarke; Abraham A Palmer; Sandra Sanchez-Roige

doi:10.1176/appi.ajp.2020.20091390

Author manuscript; available in PMC: 2023 Jan 1.

Published in final edited form as: Am J Psychiatry. 2021 May 14;179(1):58–70. doi: 10.1176/appi.ajp.2020.20091390

PMCID: PMC9272895 NIHMSID: NIHMS1814273 PMID: 33985350

Abstract

Objective.

Genome-wide association studies (GWASs) of the Alcohol Use Disorders Identification Test (AUDIT), a ten-item screener for alcohol use disorders (AUD), have elucidated novel loci for alcohol consumption and misuse. However, these studies also revealed that GWASs can be influenced by numerous biases (e.g., measurement error, selection bias), which may have led to inconsistent genetic correlations between alcohol involvement and AUD, as well as paradoxically negative genetic correlations between alcohol involvement and psychiatric disorders/medical conditions.

Methods.

To explore these unexpected differences in genetic correlations, we conducted the first item-level and largest GWAS of AUDIT items (N=160,824), and applied a multivariate framework to mitigate previous biases.

Results.

We identified novel patterns of similarity (and dissimilarity) among the AUDIT items, and found evidence of a correlated two-factor structure at the genetic level (Consumption and Problems, r_g=.80). Moreover, by applying empirically-derived weights to each of the AUDIT items, we constructed an aggregate measure of alcohol consumption that is strongly associated with alcohol dependence (r_g=.67) and several other psychiatric disorders, and no longer positively associated with health and positive socioeconomic outcomes. Lastly, by conducting polygenic analyses in three independent cohorts that differed in their ascertainment and prevalence of AUD, we identified novel genetic associations between alcohol consumption, alcohol misuse, and human health.

Conclusions.

Our work further emphasizes the value of AUDIT for both clinical and genetic studies of AUD, and the importance of using multivariate methods to study genetic associations that are more closely related to AUD.

INTRODUCTION

Over the past decade, genome-wide association studies (GWASs) have advanced our understanding of alcohol use disorders (AUDs)(1). Many of these studies have relied on a categorical approach to AUD phenotypes, comparing clinically-ascertained cases and controls (e.g., 2), but recent studies have increasingly employed a complementary approach leveraging dimensional measures of alcohol consumption and screener-based AUD symptoms in population-based cohorts (e.g., 3–6). Often, these dimensional measures can more easily be administered at scale via self-report questionnaires than can clinical diagnostic measures, thereby accelerating genetic discovery through drastic increases in sample size. The Alcohol Use Disorders Identification Test (AUDIT)(7), a ten-item questionnaire that screens for drinking habits and problems by measuring aspects of alcohol use and misuse in the past year, is one such measure. A recent GWAS meta-analysis of AUD and AUDIT phenotypes identified 29 novel loci (5), representing one of the biggest advances of AUD genetics to date (2–4, 6).

Notably, several studies using self-report instruments have revealed that not all aspects of alcohol involvement are interchangeable. While AUDIT can be used as a unidimensional screener (i.e., AUDIT-Total), previous research has shown that AUDIT can differentiate between two related but distinct facets of AUD: alcohol consumption (sum of items 1–3, “AUDIT-C”), which is necessary but not sufficient for a diagnosis of AUD, and problematic consequences of alcohol consumption (sum of items 4–10, “AUDIT-P”), which more closely resemble the diagnostic criteria of AUD. We previously found that AUDIT-C and AUDIT-P have distinct genetic relationships with clinically-defined AUD (6), as well as other forms of psychopathology. Surprisingly, AUDIT-C was positively associated with socioeconomic variables, negatively associated with some forms of psychopathology, and only moderately positively associated with alcohol dependence, whereas AUDIT-P exhibited strong positive associations with alcohol dependence and numerous other psychiatric disorders. Although this divergence might reflect true differences in the biology underlying alcohol consumption versus problems, it may be confounded by other factors, such as sources of selection bias, genetic heterogeneity among the individual items, and measurement error (1, 8).

As AUDIT-C and AUDIT-P are computed using an unweighted composite score approach, they inherently rely on the assumptions that (i) the scale is unidimensional, and (ii) each item is equally informative of the construct being measured. This approach is not based on any empirical evidence but rather reflects a holdover from the original use of the AUDIT as a screener for primary health care settings. Therefore, it is possible that the lack of item-specific weights introduces error in downstream analyses. While these issues have been thoroughly studied at the phenotypic level via factor analysis (Table S1), they have not yet been investigated at the genetic level. Using methods that can account for, or mitigate, such measurement problems will allow researchers to better capitalize on the potential of dimensional measures like AUDIT for genetic discovery.

In the present study, we sought to elucidate the genetics of alcohol consumption and problematic consequences of alcohol use measured via AUDIT using Genomic Structural Equation Modeling (9), a novel multivariate framework that allows for structural equation modeling techniques to be applied to genetic covariance matrices based on GWAS results. Accordingly, we undertook the first item-level and largest to-date GWAS meta-analyses of AUDIT (N=160,824), using data from three population-based cohorts of European ancestry. We then used Genomic Structural Equation Modeling (9) to analyze the item-level GWAS results with the aims of (i) investigating the latent genetic factor structure of AUDIT, based on prior knowledge (Table S1), and (ii) conducting multivariate GWASs of the resulting latent genetic factor(s). We posited that applying this approach would lead to more nuanced, empirically-derived weights to each of the AUDIT items when constructing our aggregate measures (as opposed to giving each item equivalent weight), which is a novel approach for GWASs of AUD phenotypes. Finally, to characterize the biology and liability associated with each latent genetic factor, we used a variety of in-silico tools and polygenic analyses spanning three independent cohorts that varied in their method of ascertainment and prevalence of AUD.

We hypothesized that a higher resolution of each of the alcohol phenotypes measured in AUDIT would further our understanding of the differences among indices of alcohol consumption (items 1–3) and problematic alcohol use (items 4–10), and how they relate to human health. We anticipated that the genetic contributions to alcohol consumption and problematic use would not be completely overlapping, and that genomic modeling using item-level data would ameliorate the confounding issues between alcohol consumption, AUD, and indices of health that complicated previous GWAS efforts.

MATERIALS AND METHODS

Discovery samples and phenotype construction

We collected AUDIT (7) and genotype data from three population-based cohorts: UK Biobank (n_max=147,267), the Netherlands Twin Register (n_max=9,975), and the Avon Longitudinal Study of Parents and Children (ALSPAC, n_max=3,582). We used the same phenotyping strategies across the three cohorts, which are described in the Supplementary Material 2. AUDIT scores and demographics for each cohort are reported in Table S2. Genotyping, imputation and quality control procedures have been extensively described in previous publications (10–12). Because AUDIT was administered with skip logic in UK Biobank, we used multiple imputation by chained equations to minimize the impact of missing data on our item-level GWAS (see Supplementary Material 2.1 for details).

Univariate genome-wide association and meta-analyses

In UK Biobank, we used BOLT-LMM (13) v2.3.2 to conduct GWASs for each of the ten AUDIT items with the first 40 ancestry principal components, sex, age, sex-by-age interactions, and batch as covariates. In the Netherlands Twin Register, we used the fastgwa function of GCTA (14) and included the first 5 ancestry principal components, sex, birth year, and genotyping platform as covariates. In ALSPAC, we analyzed unrelated participants using PLINK v2.0 (15), including the first 10 ancestry principal components, sex, and age as covariates. Note that both BOLT-LMM and fastgwa are capable of analyzing related individuals. Further details are included in Supplementary Material 3, as well as prior work (16). We then used METAL (17) to conduct sample-size weighted meta-analyses of the cohort-level GWAS summary statistics for each AUDIT item following quality control procedures (see Supplementary Material 4). A total of 8,596,116 SNPs were included in the meta-analyses.
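
The sample-size weighted scheme can be illustrated with a minimal sketch. It assumes the standard form in which each cohort's Z-score for a SNP is weighted by the square root of its sample size; the function names and the example Z-scores are hypothetical and are not taken from the study, and the real analysis was run in METAL rather than with this code.

```python
import math


def sample_size_weighted_z(z_scores, sample_sizes):
    """Combine per-cohort Z-scores for one SNP, weighting each by sqrt(N)."""
    if len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per Z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Maximum cohort sizes reported in the text (UK Biobank, NTR, ALSPAC).
cohort_n = [147267, 9975, 3582]
# Hypothetical per-cohort Z-scores for a single SNP (illustration only).
cohort_z = [4.8, 1.1, -0.3]

meta_z = sample_size_weighted_z(cohort_z, cohort_n)
meta_p = two_sided_p(meta_z)
print(f"meta-analytic Z = {meta_z:.3f}, p = {meta_p:.2e}")
```

Because the weights scale with the square root of N, the UK Biobank estimate dominates the combined statistic in this setting, which is one reason the smaller cohorts mainly contribute precision rather than change the direction of effect.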

Phenotypic and genetic correlations

We used the lavaan (18) v0.6.5 package in R to estimate polychoric phenotypic correlations (r_p) among AUDIT items. We used the Genomic Structural Equation Modeling v0.0.2 package in R, which is based on LD score regression (19), to estimate the heritability of each of the ten AUDIT items, and the genetic correlations between them. We applied standard quality control procedures prior to all analyses (e.g., used precomputed LD scores, excluded the major histocompatibility region, SNPs restricted to HapMap 3, applied minor allele frequency ≥1% and information score >.90 filters). Lastly, we used Genomic Structural Equation Modeling (9) to estimate genetic correlations between latent genetic factors and complex traits and disorders broadly related to human health (Supplementary Material 5.1.2). We applied a standard Benjamini–Hochberg false discovery rate correction (FDR 5%) to account for multiple testing.
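
As an illustration of the multiple-testing step, the Benjamini–Hochberg step-up procedure can be sketched as follows. The function name and the example p-values are hypothetical; this is a generic sketch of the procedure, not the code used in the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per test: True if significant at the given FDR."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Every test ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values for eight genetic correlations (illustration only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
flags = benjamini_hochberg(example_p)
print(flags)
```

The step-up character of the procedure matters: a test can be declared significant even if its own p-value exceeds its rank-specific threshold, provided a larger p-value further down the ordering meets its threshold.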

Phenotypic and genetic factor analysis

To empirically model the phenotypic and genetic relationships among AUDIT items, we used lavaan (18) and Genomic Structural Equation Modeling (9) to conduct phenotypic and genetic confirmatory factor analyses, respectively, using weighted least squares estimation. Further details are provided in Supplementary Material 5.1 and described extensively elsewhere (20–23). We tested three models: (i) a parallel factor model (i.e., a sum score model), (ii) a common factor model, and (iii) a correlated factors model. The common and correlated factors models were selected based on prior research (Table S1), while the parallel factor model served to test the restrictive assumptions of sum score approaches. We assessed model fit using conventional indices that were available in both the lavaan and Genomic Structural Equation Modeling software (9) (Supplementary Material 5). Only data from UK Biobank (the largest sample) were included in the phenotypic factor analyses. For the genetic factor analyses, GWAS summary statistics from the meta-analyses for each AUDIT item were subjected to standard quality control practices, as described above. Genomic Structural Equation Modeling’s multivariable version of LD score regression was then used to estimate the genetic covariance and sampling covariance matrices (S and V, respectively) for the AUDIT items, which were used to test the specified confirmatory factor models. The S matrix was smoothed beforehand as it was slightly non-positive definite. Factor extension analysis was used to estimate the expected factor loading of item 6 (i.e., ‘eye opener’; Supplementary Material 5.1.1), as it was excluded from the final genetic confirmatory factor model due to non-significant SNP heritability.
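
For orientation, the correlated factors model can be written in standard confirmatory factor analysis notation. The symbols below are generic textbook notation introduced here for illustration and are not taken from the Supplementary Material.

```latex
% Model-implied covariance matrix of the p AUDIT items under a
% two-factor model (Consumption, Problems) with simple structure:
\Sigma(\theta) = \Lambda \Psi \Lambda^{\top} + \Theta,
\qquad
\Psi = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix},
\qquad
\Theta = \operatorname{diag}(\theta_{1}, \ldots, \theta_{p})
% \Lambda : p x 2 loading matrix; each item loads on one factor only
% \rho    : correlation between the two latent factors
% \Theta  : diagonal matrix of item-specific residual variances
```

In the genetic analyses the observed matrix being approximated is S, and estimation weights the discrepancy between S and the model-implied matrix using information from V. Under this notation the common factor model corresponds to fixing \rho = 1, and the sum score (parallel) model adds further equality constraints across items, which is why it is the most restrictive of the three.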

Multivariate genome-wide association analyses

Using Genomic Structural Equation Modeling (9), we conducted multivariate GWAS analyses by estimating SNP associations with the AUDIT latent genetic factors from the best-fitting model. The details of these analyses are described in Supplementary Material 5.1. Individual SNP effects were estimated for the latent genetic factors in each model if they (i) were available in all univariate summary statistics, (ii) had a minor allele frequency ≥.5%, and (iii) were present in the 1000 Genomes Phase 3 v5 reference panel. The effective sample size for each latent factor was estimated using the approach described by Mallard and colleagues (16).

Biological annotation, gene and transcriptome-based association analyses

We performed multiple in-silico analyses to compare the results from each of the AUDIT latent genetic factors. First, we used FUMA (24) v1.2.8 to identify independent SNPs and study their functional consequences, which included ANNOVAR categories, Combined Annotation Dependent Depletion scores, and RegulomeDB scores. Second, we used MAGMA v1.08 (24, 25) to conduct competitive gene-set and pathway analyses for each of the AUDIT latent genetic factors. SNPs were mapped to 18,546 protein-coding genes from Ensembl build 85. Gene-sets were obtained from MSigDB v7.0 (“Curated gene sets”, “GO terms”). We also used an extension of this method, Hi-C coupled MAGMA (H-MAGMA)(26), to assign non-coding (intergenic and intronic) SNPs to genes based on their chromatin interactions. Exonic and promoter SNPs were assigned to genes based on physical position. We used four Hi-C datasets, which were derived from fetal brain, adult brain, iPSC-derived neurons, and iPSC-derived astrocytes (https://github.com/thewonlab/H-MAGMA). Lastly, we used S-PrediXcan v0.6.2 (27) to predict gene expression levels in 13 brain tissues, and to test whether the predicted gene expression showed divergent correlation patterns with each of the AUDIT latent genetic factors. Pre-computed tissue weights from the Genotype-Tissue Expression (GTEx v8) project database (https://www.gtexportal.org/) were used as the reference transcriptome dataset. Further details are provided in Supplementary Material 6.

Polygenic risk score analyses

Prediction of alcohol phenotypes in UK Biobank and COGA.

We used the PRS-CS “auto” version (28) to compute polygenic scores (PRSs) for the latent genetic AUDIT factors (Consumption and Problems) and their sum score counterparts (AUDIT-C and AUDIT-P) in European subjects from two independent samples: (i) an independent subset of unrelated individuals of European ancestry in the UK Biobank who did not fill out the AUDIT, and (ii) a subset of individuals of European ancestry from the Collaborative Study on the Genetics of Alcoholism (COGA)(29), which includes probands meeting criteria for alcohol dependence, their family members, and community control families. Using the ‘score’ algorithm in PLINK v1.90, we computed individual-level PRS to predict additional alcohol phenotypes (drinking quantity, drinking frequency, and lifetime AUD diagnosis) measured in UK Biobank and COGA (Supplementary Material 7). We tested for associations between AUDIT PRSs and alcohol phenotypes using linear (quantity and frequency phenotypes) or logistic (AUD) regression models in R v3.6.3. In UK Biobank, we included sex, age at first assessment, Townsend Deprivation Index score (30) and the first 10 ancestry principal components as covariates. In COGA, we included age, sex, array type, income, and the first 10 ancestry principal components as fixed effect covariates, with family identity included as a random effect (i.e., allowing the intercept to vary by family).
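
The scoring step can be sketched as a weighted sum of effect-allele dosages. The sketch below assumes per-SNP weights of the kind PRS-CS outputs and uses a plain sum; PLINK's scoring can report sums or averages depending on its options, which changes only the scale of the score. All SNP identifiers, weights, and dosages are hypothetical.

```python
import math


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: dict mapping SNP id -> effect-allele dosage (0 to 2)
    weights: dict mapping SNP id -> per-allele effect size
    SNPs missing from the individual's genotypes are skipped.
    """
    return sum(w * dosages[snp] for snp, w in weights.items() if snp in dosages)


def standardize(scores):
    """Scale raw scores to mean 0 and (sample) standard deviation 1."""
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)
    sd = math.sqrt(variance)
    return [(s - mean) / sd for s in scores]


# Hypothetical posterior effect sizes for three SNPs (illustration only).
snp_weights = {"rsA": 0.012, "rsB": -0.007, "rsC": 0.004}
# Hypothetical dosages for three individuals.
cohort = [
    {"rsA": 2, "rsB": 0, "rsC": 1},
    {"rsA": 1, "rsB": 1, "rsC": 1},
    {"rsA": 0, "rsB": 2, "rsC": 0},
]

raw_scores = [polygenic_score(person, snp_weights) for person in cohort]
z_scores = standardize(raw_scores)
print(raw_scores, z_scores)
```

Standardizing the score before regression is a common convenience so that effect sizes are expressed per standard deviation of the PRS; whether the study did so is not stated in this section.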

We sought to compare the performance of the latent factor-based PRSs (Consumption and Problems PRSs) against the performance of their sum score counterparts (AUDIT-C and AUDIT-P PRSs) in predicting different alcohol phenotypes. To this end, we applied two approaches to our PRS analyses: (i) cross-dimension PRS models (i.e., Consumption + Problems PRSs included as simultaneous predictors), and (ii) cross-method PRS models (i.e., Consumption + AUDIT-C PRSs included as simultaneous predictors in a model, and Problems + AUDIT-P PRSs included as simultaneous predictors in a model). We corrected for the total number of outcome phenotypes across the validation samples using a conservative Bonferroni p value = 8.33E-3, since the same PRSs were used as predictors across models (and were correlated with each other).
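
The reported threshold is consistent with a Bonferroni correction of a nominal α of .05 across six outcome phenotypes. The count of six (three outcomes in UK Biobank plus three in COGA) is inferred here from the description of the validation samples rather than stated explicitly in this section.

```latex
\alpha_{\text{Bonferroni}} = \frac{0.05}{6} \approx 8.33 \times 10^{-3}
```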

Phenome-wide association study in BioVU.

To examine exploratory associations between PRSs and hundreds of medical diagnoses, we used the PRS-CS method (28) described above to compute Consumption and Problems PRSs for each of the 66,915 unrelated genotyped individuals of European ancestry from the Vanderbilt University Medical Center biobank (BioVU)(31). Using electronic health record data in BioVU, we performed phenome-wide association studies (PheWASs) for Consumption and Problems PRSs using the PheWAS (32) v0.12 package in R. Specifically, we fit a logistic regression model to each of the 1,335 case/control phenotypes in BioVU (“phecodes”; Supplementary Material 7.3) in order to estimate the effect of a given PRS on each diagnosis. Sex, median age of the longitudinal electronic health record measurements, and the first 10 principal components were included as covariates. We then repeated the PheWAS analyses using AUD diagnoses (phecodes 317, 317.1) as additional covariates. A standard Benjamini–Hochberg false discovery rate (FDR 5%) correction was applied to account for multiple testing.

RESULTS

Phenotypic and genetic analyses reveal a consistent two-factor structure of alcohol consumption and problematic use

Phenotypic and genetic analyses showed that AUDIT items were positively correlated with each other, with correlation estimates ranging from moderate to large (Tables S3–4). The one exception to this pattern was item 1 (i.e., frequency of consumption), which was generally less correlated with the other AUDIT items. Moreover, we found that genetic correlations tended to be moderately larger than the phenotypic correlations (mean absolute difference = .198), an effect that was driven by stronger genetic correlations among items 4 through 10 (i.e., the problematic alcohol use phenotypes). Of note, all AUDIT items exhibited significant SNP heritability with the exception of item 6 (Table S5). We suspect this may be attributable to the low rates of endorsement for the item in all three cohorts (Table S2). For this reason, we excluded item 6 from all subsequent analyses, and a factor extension analysis was used to estimate its expected factor loading in the final model.

We found that a correlated factors model provided the best fit (Figure 1, Tables S6–7) to both the genetic and the phenotypic covariance matrices [phenotypic model: (χ²(26)=4252.963, Comparative Fit Index=.994, standardized root mean square residual=.041), genetic model: (χ²(26)=142.689, Comparative Fit Index=.982, standardized root mean square residual=.067)]. That is, the patterns of genetic and phenotypic correlations among the AUDIT items could both be represented by a factor model with two correlated factors: one that captured the covariance among alcohol consumption items (items 1–3, henceforth “Consumption”) and one that captured the covariance among alcohol-related problems (items 4–10, henceforth “Problems”). These two latent factors were highly correlated with each other, phenotypically (r_p=.825, SE=.002) and genetically (r_g=.801, SE=.037). Nearly all items had large factor loadings across both levels of analyses except item 1, which consistently had markedly smaller factor loadings and larger residual variances.

The two correlated-factors model was compared to other solutions. A model with a single common factor provided acceptable fit for the phenotypic (χ²(27)=14967.064, Comparative Fit Index=.978, standardized root mean square residual=.070) and genetic (χ²(27)=350.785, Comparative Fit Index=.949, standardized root mean square residual=.094) factor analyses, but it did not minimize the standardized difference between the observed and predicted correlations as well as the correlated factors model (Table S7). The parallel factor model (i.e., the sum score model) exhibited very poor fit, reflected by the strong, unanimous bias observed in the model implied correlations [phenotypic model: (χ²(34)=43655.530, Comparative Fit Index=.936, standardized root mean square residual=.143), genetic model: (χ²(43)=607.196, Comparative Fit Index=.911, standardized root mean square residual=.470)]. Accordingly, we identified the correlated factors model as the best fitting and most appropriate model for further genetic analyses.

Latent variable approach characterizes and ameliorates bias in GWAS of alcohol consumption

By estimating genetic correlations in a Genomic Structural Equation Modeling framework, we identified interesting patterns of relationships between 100 exogenous phenotypes (chosen based on previous findings or hypothesized relationships) and the Consumption and Problems latent genetic factors. We also examined correlations with the residual genetic variance in item 1 (i.e., the genetic variance in item 1 that is unrelated to other AUDIT items; henceforth “Frequency Residual”). Results are reported in Table S8.

For Consumption and Problems, we found that their patterns of genetic correlation with other phenotypes were much more similar than previously reported for AUDIT-C and AUDIT-P (4). Both Consumption and Problems showed strong positive genetic correlations with alcohol dependence. Consumption and Problems were also positively related to other measures of substance use (e.g. cannabis use disorder, impulsivity). Furthermore, the previous positive associations that we observed between AUDIT and indices of socioeconomic status (e.g. educational attainment) were now attenuated.

We did still observe that, compared to Consumption, Problems was more strongly related to psychopathology (e.g., post-traumatic stress disorder, depression, bipolar disorder, schizophrenia). We also identified novel divergent associations with pain phenotypes, malnutrition and measures of social satisfaction (e.g., Problems showing genetic overlap with these conditions) suggesting that, as we anticipated, the genetic contributions to alcohol consumption and misuse reflect both complementary and distinct genetic factors.

Finally, Frequency Residual was negatively associated with alcohol dependence (Figure 2). We also found positive genetic correlations between Frequency Residual and socioeconomic outcomes, including educational attainment, household income, and intelligence. Furthermore, we observed consistently negative genetic correlations between Frequency Residual and other psychiatric and substance use disorders, such as major depressive disorder and cannabis use disorder. This result suggests that many of the puzzling genetic correlations previously reported for alcohol consumption were driven by variance related to socially-stratified differences in behavior rather than variance related to the alcohol phenotypes of clinical interest.

Multivariate GWAS confirm a distinct genetic basis between alcohol consumption and misuse

The results of our multivariate GWAS for Consumption and Problems are presented in Figure 3. We identified 8 independent loci that were associated with Consumption (Table S9). For Problems, we replicated 2 loci on chromosome 4, located in the ethanol metabolizing gene ADH1B (Table S10). The signal associated with the latent factors is convergent with that of the sum scores, with a few exceptions (Supplementary Material 6.1.1 and Tables S11–12).

Some loci included genes that were only associated with Consumption (Table S31), such as KLB, RCF1 and the MAPT/CRHR1 region, which were previously associated with alcohol consumption behaviors (3–5, 33), and other novel candidate genes for alcohol, such as CPS1, previously associated with metabolic conditions (Table S13).

We performed in-silico gene-based and transcriptome-based analyses (Tables S15–30), which consistently revealed both convergent and divergent associations for Consumption and Problems (Table S31). For example, both factors robustly implicated ethanol metabolizing genes (ADH1B, ADH1C) and dopamine transmission [DRD2, involved in mediating the rewarding effects of drugs (34)], as well as pleiotropic genes previously implicated in anthropometric and metabolic traits [e.g., CELF1 (5, 35)], and intelligence [e.g., MTCH2 (36), FAM180B/NDUFS3 (37)].

Lastly, gene-set analyses revealed that genes more closely linked to cellular responses to alcohol drinking (e.g., cellular response to retinoic acid) were associated with Consumption (Table S17), while the gene-sets related to postsynaptic modulation of chemical synaptic transmission were associated with Problems (Table S18).

Polygenic risk analyses

UK Biobank.

In UK Biobank, we found that both Consumption and Problems PRSs were robustly associated with drinking frequency, drinking quantity, and lifetime AUD (Figure 4). However, Consumption PRS outperformed (i.e., explained more variance) Problems PRS for alcohol consumption phenotypes (Table S32). When the latent factor PRSs and sum score PRSs for the same construct were both included in the multiple regression model (e.g., Consumption and AUDIT-C PRSs), Consumption PRS outperformed AUDIT-C PRS in predicting AUD diagnosis and drinking quantity (but not frequency), while AUDIT-P PRS outperformed Problems PRS across all three phenotypes (Table S33).

Figure 4. Associations between Consumption and Problems PRS and selected alcohol-related phenotypes. Bar charts of the variance explained by *Consumption* and *Problems* PRS for various clinical and quantitative measures of alcohol use. Values correspond to the proportion of variance explained in the outcome (R² or pseudo-R², depending on the use of linear or logistic regression; see Supplementary Section 7 for more details). Results for the independent UK Biobank subsample are presented on the left, while results for the independent COGA cohort are presented on the right. Please note that the COGA models are not directly comparable to the UK Biobank models, as mixed-effect models were used in COGA. Please also note that the R² for each PRS is calculated from a single PRS model and, as such, the values are not independent (due to shared variance between PRSs). Complete results are available in Tables S32–S35.

COGA.

In COGA, PRS results aligned with those observed in UK Biobank, with a few exceptions. When both Consumption and Problems PRS were included in the same model, only Consumption PRS showed significant associations with drinks per week, MaxDrinks, and AUD (Table S34). As observed in UK Biobank, when latent factor PRSs and sum score PRSs for the same construct were both included in the multiple regression model, Consumption outperformed AUDIT-C PRS, whereas AUDIT-P PRS outperformed Problems (Table S35). Interestingly, in those models, we found that the strongest associations were between Consumption PRS and AUD, and AUDIT-P PRS and AUD.

BioVU.

We performed two independent PheWASs of Consumption and Problems PRSs to identify whether these two factors would show different patterns of genetic associations with medical outcomes. Of 1,335 phenotypes, 15 were FDR-significantly associated with Consumption (Figure 5, Table S36) and 17 with Problems (Table S37). Both factors were significantly associated with AUD and other tobacco and substance use disorders. Replicating our previous results for AUDIT-C and AUDIT-P, we observed paradoxical negative associations between Consumption and metabolic conditions, including diabetes mellitus and obesity phenotypes, whereas Problems was primarily positively associated with other psychiatric disorders, including depression, anxiety disorder, bipolar disorder, schizophrenia and suicidal ideation or attempt. Intriguingly, Problems was also negatively associated with type 2 diabetes with renal manifestations. Most of the associations did not persist after correcting for AUD, although the direction of effects remained consistent (Tables S38–39).

Figure 5. Phenome-wide association study of polygenic risk scores for *Consumption* (left panel) and *Problems* (right panel) against 1,335 diseases available in the biobank from Vanderbilt University Medical Center, BioVU. PheWAS of both *Consumption* and *Problems* revealed positive genetic associations with alcohol use disorders. *Problems* was positively genetically associated with multiple psychiatric conditions, whereas *Consumption* was counterintuitively negatively associated with metabolic conditions. Importantly, most of the associations disappear when adjusting for alcohol use disorder diagnosis (non-significant associations are highlighted in gray).

DISCUSSION

In the present study, we report the first item-level and largest GWAS of AUDIT to date (N=160,824), and we used Genomic Structural Equation Modeling to elucidate the genetic etiology of alcohol consumption and problematic alcohol use. By conducting phenotypic and genetic factor analyses of the individual AUDIT items, we provide evidence that two correlated latent factors (Consumption and Problems) parsimoniously explained the covariance in measures of alcohol consumption and problematic alcohol use across both levels of analysis. Moreover, by applying empirically-derived weights to the AUDIT items in a Genomic Structural Equation Modeling framework, we demonstrated that our method can ameliorate confounding biases that have complicated previous work with consumption phenotypes (in particular, the bias present in item 1). Notably, both Consumption and Problems share a strong, positive genetic correlation with alcohol dependence (both r_g~0.7), and we show, for the first time, that the polygenic signal of Consumption is strongly associated with several AUD phenotypes in three independent cohorts. Finally, the results of our bioinformatic analyses further illustrate that Consumption and Problems have unique components of their genetic etiology. Collectively, our novel framework provides a means to study two genetic liabilities that are more closely related to AUD, and advances our understanding of the associated biology in several ways, as we delineate below.

First, we built upon recent investigations of the genetic etiology of AUDs and related traits by analyzing each of the ten unique items that comprise AUDIT. At this higher resolution, we were able to identify sources of genetic heterogeneity among the items, such as the consistently weaker genetic correlations between frequency of alcohol consumption (item 1) and other drinking patterns (items 2–3) and AUD symptoms (items 4–10). Our item-level approach also allowed us to empirically model the genetic relationships between AUDIT items, providing the first empirical evidence of a correlated, two-factor structure for AUD symptoms at the genetic level. In doing so, we also generated empirically-derived weights to determine how individual items contribute to aggregate measures of alcohol consumption and problematic use. This is an important advance from most quantitative or dimensional genetic studies of AUDs (and other forms of psychopathology), which often use composite score measures that lack statistical justification.

Second, and perhaps most importantly, we found that Consumption was a good genetic proxy of AUD when appropriate weights were applied to the individual items using Genomic Structural Equation Modeling. This is a striking change from previous investigations into the divergent genetic bases of alcohol consumption and problematic use, including our own prior analyses of AUDIT. GWASs of alcohol consumption phenotypes have consistently reported low-to-moderate overlap with AUDs, which has surprised many researchers (2–5), and even paradoxical negative associations with a variety of diseases and disorders. Our multivariate approach has ameliorated these issues, producing an aggregate measure of alcohol consumption that is more consistent with the known patterns of alcohol phenotype associations established in the existing body of literature, such as a strong genetic correlation with alcohol dependence. Furthermore, we used genetic correlation analyses to characterize the residual genetic variance in frequency of consumption (Frequency Residual) that is unrelated to other AUDIT items. These analyses revealed that Frequency Residual had consistently positive associations with measures of socioeconomic status and consistently negative associations with measures of substance use and psychopathology. Indeed, these genetic correlations are very similar to those observed in GWASs of AUDIT-C (4, 5) and other GWASs of alcohol consumption (3, 4), suggesting that single-item frequency-based measures of alcohol consumption may be particularly susceptible to confounding and/or selection bias. For example, Marees and colleagues (38) reported that greater frequency of alcohol consumption was associated with higher socioeconomic status and lower risk of other psychiatric and substance use disorders in UK Biobank. In population-based cohorts with a “healthy volunteer” bias, such as the UK Biobank, the relationship between frequency of alcohol consumption and aspects of physical and mental health may not be fully generalizable (39). We speculate that the degree of this bias will vary from population to population.

Third, we confirmed that the genetic contributions to alcohol consumption are partially distinct from those pertaining to problematic consequences of alcohol use. In-silico analyses revealed the value of dissecting the two phenotypes, as gene- and transcriptome-based analyses identified partially divergent biological mechanisms for Consumption and Problems. For example, the corticotropin-releasing hormone receptor 1 gene (CRHR1), which has been associated with alcohol use in animals and humans (40, 41), was associated with Consumption only. As a result, we are now beginning to uncover genetic signals for aspects of alcohol involvement that have the potential to be further analyzed at the molecular, cellular, and circuit levels in cellular and animal model systems.

Fourth, we found that Consumption PRS was strongly associated with AUD even in higher-risk cohorts like COGA. This demonstrates the important downstream effects of allowing items to have different weights in phenotype construction. Whereas our current and previous PRS for AUDIT-C have been disproportionately influenced by a single item (frequency of consumption) (42), our Consumption PRS was composed of the genetic effects shared among all consumption-focused items. The Consumption and Problems PRSs were both strongly associated with AUD in UK Biobank, even when both scores were entered in the same model. In COGA, both Consumption and Problems PRSs were associated with AUD, but Consumption PRS was more strongly associated than Problems PRS. The increased influence of binge drinking (item 3), which had a large factor loading on Consumption, may be partially responsible for these stronger associations in a high-risk sample. However, it is perhaps more likely that these differences are simply explained by differences in item endorsement, and thus in the predictive power of the discovery GWASs (e.g., Consumption had a greater mean χ² than Problems).
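What it means for two scores to be entered in the same model can be made concrete with a minimal sketch. The example below uses simulated data only (no values from this study), a continuous outcome in place of AUD case status, and ordinary least squares in place of the models actually fitted. The effect sizes, the correlation between the scores, and the function name `joint_prs_effects` are all illustrative assumptions.

```python
import numpy as np


def joint_prs_effects(prs_a, prs_b, outcome):
    """Regress an outcome on two polygenic scores at once (OLS).

    Returns (intercept, beta_a, beta_b); each beta is the effect of one
    score while holding the other constant.
    """
    design = np.column_stack([np.ones_like(prs_a), prs_a, prs_b])
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return tuple(coefs)


# Simulated, purely illustrative data: two standardized scores that
# correlate about 0.5 and each contribute independently to the outcome.
rng = np.random.default_rng(0)
n = 20_000
shared = rng.standard_normal(n)
prs_consumption = (shared + rng.standard_normal(n)) / np.sqrt(2)
prs_problems = (shared + rng.standard_normal(n)) / np.sqrt(2)
liability = 0.3 * prs_consumption + 0.2 * prs_problems + rng.standard_normal(n)

intercept, b_consumption, b_problems = joint_prs_effects(
    prs_consumption, prs_problems, liability
)
print(f"Consumption beta: {b_consumption:.2f}, Problems beta: {b_problems:.2f}")
```

In this simulated setting both coefficients remain clearly non-zero in the joint model because each score carries variance that the other does not.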

Finally, our comprehensive PheWAS analyses have linked different facets of AUD liability (via the latent factor-based Consumption and Problems PRSs) to a myriad of health-related outcomes in a large, independent biobank. We found that the Consumption PRS was consistently negatively associated with a broad range of metabolic and congenital conditions. While it is possible that there is still residual bias in the discovery GWAS, it is important to note that this pattern of paradoxical associations with Consumption is not observed in the genetic correlation analyses. Thus, it is possible that these negative associations are illustrative of selection bias or other confounding in BioVU (43), where patients with certain conditions may elect to not drink due to unmeasured factors (e.g., family history, medical advice, contraindications for prescriptions). Mirroring the genetic correlation results, we also found that the Problems PRS was uniquely associated with numerous psychiatric disorders that are commonly reported to co-occur with AUD. However, and importantly, we identified that the associations between Problems PRS and mental health did not persist in the absence of the clinical manifestation of AUD. These findings suggest that the associations with mental health are not the result of horizontal pleiotropy. Instead, they may be either (i) a consequence of AUD, (ii) correlated with other risk factors for AUD (along and/or aside from genetic risk), or (iii) related to ascertainment of patients with diagnosed AUD in the medical record. These results also encouragingly suggest that treating AUD could have widespread improvements in overall health.

These findings should be interpreted in light of several limitations. AUDIT is a self-report measure that can be influenced by misreporting, and it captures alcohol use only in the past year, so scores can be influenced by longitudinal changes in drinking that may be a consequence, for example, of other illnesses (44). People who stopped drinking and never drinkers might represent genetically distinct groups; in our dataset, 4,511 individuals were never drinkers and 4,290 were previous drinkers. While our approach has substantially reduced bias in AUDIT without excluding any individuals from discovery, future studies might consider employing multiple techniques (e.g., separating never drinkers from former drinkers) to further alleviate potential biases associated with frequency of alcohol use in population-based cohorts. Additionally, while the AUDIT PRSs tended to perform similarly in UK Biobank and COGA, the portability of PRSs can be influenced by demographic characteristics such as socioeconomic status, age, or sex (45). It remains to be determined how generalizable the genetics of AUDIT are across different populations, especially in samples of different ancestries (as we have only included individuals of European ancestry in the present study) or cultures (UK vs US). A similar point also applies to sex-stratified samples, considering that AUDIT scores differ between men and women. Finally, it is important to note that the Problems PRS exhibited weaker associations with AUD and other alcohol phenotypes in comparison to its AUDIT-P counterpart. Although the two predictors generally had similar effects in single PRS models, the Problems PRS was rendered redundant in the cross-method analyses when both of the highly correlated AUDIT-P and Problems PRSs (e.g., r = .84 in UK Biobank) were included in the regression models. However, we caution against the interpretation that the univariate GWAS approach is preferable. The multivariate GWAS function of Genomic Structural Equation Modeling is not only more flexible than traditional univariate GWAS, but its results may also be more robust to confounding, as the software automatically applies a correction for population stratification (20). Furthermore, Genomic Structural Equation Modeling is better suited to investigate nuanced genetic influences, including the possibility of identifying SNPs with heterogeneous effects across symptoms or items.
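One way to see why a predictor can become redundant when a highly correlated counterpart is added is the variance inflation factor for a two-predictor linear regression. This is a standard textbook result offered as an illustration, not an analysis reported in this study; it uses only the correlation quoted above (r = .84).

```latex
\mathrm{VIF} = \frac{1}{1 - r^{2}}
             = \frac{1}{1 - 0.84^{2}}
             = \frac{1}{0.2944}
             \approx 3.4,
\qquad
\sqrt{\mathrm{VIF}} \approx 1.84
```

Relative to uncorrelated predictors, the standard error of each coefficient in a linear model is therefore inflated by a factor of roughly 1.8, so the unique contribution of either score is estimated with considerably less precision.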

Analyzing alternative phenotypes as a complementary approach to studying clinically-defined AUD, and psychiatric disorders in general, has generated considerable interest in recent years (46). Collectively, our work demonstrates how AUDIT can inexpensively facilitate such efforts. Here, we have shown that, after correcting for some potential biases, item- or symptom-level analyses can help unpack the genetic etiology of AUD by breaking down genetic influences into specific and shared components; notably, this is only possible because we can contrast our results against gold standard, clinically-ascertained, AUD GWAS datasets. While composite scores have shown some utility in previous genetic association studies, such studies often rely on strong assumptions that the scale is unidimensional, and that each item is equally informative of the construct being measured. In the present paper, we have shown that the latter assumption is false for the AUDIT. In particular, a large proportion of the genetic variance of item 1 appears to be uninformative of a broader consumption construct, as it covaries with. Moreover, although we found a conspicuous degree of unidimensionality among the AUDIT items, our results demonstrate that Consumption and Problems remain distinct in their associations with human health.
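The equal-informativeness assumption discussed above can be illustrated with a phenotypic analogy. The sketch below contrasts a unit-weighted sum score with a loading-weighted composite. The responses and loadings are hypothetical values chosen for illustration, not estimates from this study, and the study itself applied weights to genetic covariances in Genomic Structural Equation Modeling rather than to individual responses.

```python
import numpy as np


def unit_weighted_score(items):
    """Sum score: every item contributes equally."""
    return np.asarray(items, dtype=float).sum(axis=1)


def loading_weighted_score(items, loadings):
    """Composite in which each item is scaled by its (assumed) loading."""
    items = np.asarray(items, dtype=float)
    loadings = np.asarray(loadings, dtype=float)
    return items @ loadings


# Hypothetical responses to three consumption items (rows = people).
responses = np.array([
    [4, 0, 0],  # drinks often, but little per occasion and never binges
    [1, 3, 3],  # drinks rarely, but heavily when drinking
])
# Hypothetical loadings: item 1 gets a small weight, items 2-3 larger ones.
loadings = np.array([0.2, 0.8, 0.9])

print(unit_weighted_score(responses))
print(loading_weighted_score(responses, loadings))
```

Under these assumed loadings, two respondents whose sum scores differ modestly are separated much more sharply once the frequency item is down-weighted.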

Supplementary Material

NIHMS1814273-supplement-Supplementary_Material.docx (136.5 KB, docx)

Supplementary Tables

NIHMS1814273-supplement-Supplementary_Tables.xlsx (51.6 MB, xlsx)

ACKNOWLEDGEMENTS

This research was conducted using the UK Biobank Resource under Application Numbers 11425 and 16406.

The AUDIT data collection in ALSPAC was funded by NIH AA018333; ACE was supported by a NIAAA grant (R01AA027522).

The Collaborative Study on the Genetics of Alcoholism (COGA), Principal Investigators B. Porjesz, V. Hesselbrock, T. Foroud; Scientific Director, A. Agrawal; Translational Director, D. Dick, includes eleven different centers: University of Connecticut (V. Hesselbrock); Indiana University (H.J. Edenberg, T. Foroud, J. Nurnberger Jr., Y. Liu); University of Iowa (S. Kuperman, J. Kramer); SUNY Downstate (B. Porjesz, J. Meyers, C. Kamarajan, A. Pandey); Washington University in St. Louis (L. Bierut, J. Rice, K. Bucholz, A. Agrawal); University of California at San Diego (M. Schuckit); Rutgers University (J. Tischfield, A. Brooks, R. Hart); The Children’s Hospital of Philadelphia, University of Pennsylvania (L. Almasy); Virginia Commonwealth University (D. Dick, J. Salvatore); Icahn School of Medicine at Mount Sinai (A. Goate, M. Kapoor, P. Slesinger); and Howard University (D. Scott). Other COGA collaborators include: L. Bauer (University of Connecticut); L. Wetherill, X. Xuei, D. Lai, S. O’Connor, M. Plawecki, S. Lourens (Indiana University); L. Acion (University of Iowa); G. Chan (University of Iowa; University of Connecticut); D.B. Chorlian, J. Zhang, S. Kinreich, G. Pandey (SUNY Downstate); M. Chao (Icahn School of Medicine at Mount Sinai); A. Anokhin, V. McCutcheon, S. Saccone (Washington University); F. Aliev, P. Barr (Virginia Commonwealth University); H. Chin and A. Parsian are the NIAAA Staff Collaborators. We continue to be inspired by our memories of Henri Begleiter and Theodore Reich, founding PI and Co-PI of COGA, and also owe a debt of gratitude to other past organizers of COGA, including Ting-Kai Li, P. Michael Conneally, Raymond Crowe, and Wendy Reich, for their critical contributions. This national collaborative study is supported by NIH Grant U10AA008401 from the National Institute on Alcohol Abuse and Alcoholism (NIAAA) and the National Institute on Drug Abuse (NIDA). ECJ was supported by funding from NIAAA (F32AA027435); AA was supported by funding from K02 DA32573, MH109532, U10AA008401 grants; HJE and JRK were supported by NIAAA U10AA008401.

For the Netherlands Twin Register, funding was obtained from the Netherlands Organization for Scientific Research (NWO) and The Netherlands Organisation for Health Research and Development (ZonMW) grants 904-61-090, 985-10-002, 912-10-020, 904-61-193, 480-04-004, 463-06-001, 451-04-034, 400-05-717, Addiction-31160008, 016-115-035, 481-08-011, 400-07-080, 056-32-010, Middelgroot-911-09-032, OCW_NWO Gravity program ‒024.001.003, NWO-Groot 480-15-001/674, Center for Medical Systems Biology (CSMB, NWO Genomics), NBIC/BioAssist/RK(2008.024), Biobanking and Biomolecular Resources Research Infrastructure (BBMRI-NL, 184.021.007 and 184.033.111), X-Omics 184-034-019; Spinozapremie (NWO-56-464-14192), KNAW Academy Professor Award (PAH/6635) and University Research Fellow grant (URF) to DIB; Amsterdam Public Health research institute (former EMGO+), Neuroscience Amsterdam research institute (former NCA); the European Community’s Fifth and Seventh Framework Program (FP5-LIFE QUALITY-CT-2002-2006, FP7-HEALTH-F4-2007-2013, grant 01254: GenomEUtwin, grant 01413: ENGAGE and grant 602768: ACTION); the European Research Council (ERC Starting 284167, ERC Consolidator 771057, ERC Advanced 230374), Rutgers University Cell and DNA Repository (NIMH U24 MH068457-06), the National Institutes of Health (NIH, R01D0042157-01A1, R01MH58799-03, MH081802, DA018673, R01 DK092127-04, Grand Opportunity grants 1RC2 MH089951, and 1RC2 MH089995); the Avera Institute for Human Genetics, Sioux Falls, South Dakota (USA). Part of the genotyping and analyses were funded by the Genetic Association Information Network (GAIN) of the Foundation for the National Institutes of Health. Computing was supported by NWO through grant 2018/EW/00408559, BiG Grid, the Dutch e-Science Grid and SURFSARA. MGN is supported by R01MH120219, ZonMW grants 849200011 and 531003014 from The Netherlands Organisation for Health Research and Development, a VENI grant awarded by NWO (VI.Veni.191G.030) and is a Jacobs Foundation Fellow.

PRS analyses using UK Biobank data were carried out on the Genetic Cluster Computer hosted by the Dutch National computing and Networking Services SurfSARA.

The dataset(s) used for the PheWAS analyses described were obtained from Vanderbilt University Medical Center’s BioVU which is supported by numerous sources: institutional funding, private agencies, and federal grants. These include the NIH funded Shared Instrumentation Grant S10RR025141; and CTSA grants UL1TR002243, UL1TR000445, and UL1RR024975. Genomic data are also supported by investigator-led projects that include U01HG004798, R01NS032830, RC2GM092618, P50GM115305, U01HG006378, U19HL065962, R01HD074711; and additional funding sources listed at https://victr.vumc.org/biovu-funding/. LKD obtained support from 1R01MH113362, 1R01MH118233 and 1R56MH120736.

SSR was supported by a NARSAD Young Investigator Award from the Brain and Behavior Foundation (Grant Number 27676). YH, MVJ, SSR and AAP were supported by funds from the California Tobacco-Related Disease Research Program (TRDRP; Grant Number 28IR-0070 and T29KT0526).

DATA AVAILABILITY

The GWAS summary statistics for each latent AUDIT factor, and the sum score counterparts (AUDIT-C and AUDIT-P), will be made available on the PGC website.

REFERENCES

1. Sanchez-Roige S, Palmer AA, Clarke T-K: Recent Efforts to Dissect the Genetic Basis of Alcohol Use and Abuse. Biol Psychiatry 2020; 87:609–618
2. Walters RK, Polimanti R, Johnson EC, et al.: Transancestral GWAS of alcohol dependence reveals common genetic underpinnings with psychiatric disorders. Nat Neurosci 2018; 21:1656–1669
3. Liu M, Jiang Y, Wedow R, et al.: Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat Genet 2019; 51:237–244
4. Kranzler HR, Zhou H, Kember RL, et al.: Genome-wide association study of alcohol consumption and use disorder in 274,424 individuals from multiple populations. Nat Commun 2019; 10:1499
5. Zhou H, Sealock JM, Sanchez-Roige S, et al.: Meta-analysis of problematic alcohol use in 435,563 individuals identifies 29 risk variants and yields insights into biology, pleiotropy and causality. Nat Neurosci 2019; 738088
6. Sanchez-Roige S, Palmer AA, Fontanillas P, et al.: Genome-Wide Association Study Meta-Analysis of the Alcohol Use Disorders Identification Test (AUDIT) in Two Population-Based Cohorts. Am J Psychiatry 2019; 176:107–118
7. Saunders JB, Aasland OG, Babor TF, et al.: Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO Collaborative Project on Early Detection of Persons with Harmful Alcohol Consumption--II. Addict Abingdon Engl 1993; 88:791–804
8. Litten RZ, Ryan ML, Falk DE, et al.: Heterogeneity of Alcohol Use Disorder: Understanding Mechanisms to Advance Personalized Treatment. Alcohol Clin Exp Res 2015; 39:579–584
9. Grotzinger AD, Rhemtulla M, de Vlaming R, et al.: Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat Hum Behav 2019; 3:513–525
10. Boyd A, Golding J, Macleod J, et al.: Cohort Profile: The ‘Children of the 90s’—the index offspring of the Avon Longitudinal Study of Parents and Children. Int J Epidemiol 2013; 42:111–127
11. Boomsma DI, Vink JM, Beijsterveldt TCEM van, et al.: Netherlands Twin Register: A Focus on Longitudinal Research. Twin Res Hum Genet 2002; 5:401–406
12. Bycroft C, Freeman C, Petkova D, et al.: The UK Biobank resource with deep phenotyping and genomic data. Nature 2018; 562:203–209
13. Loh P-R, Tucker G, Bulik-Sullivan BK, et al.: Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet 2015; 47:284–290
14. Jiang L, Zheng Z, Qi T, et al.: A resource-efficient tool for mixed model association analysis of large-scale data. Nat Genet 2019; 51:1749–1755
15. Chang CC, Chow CC, Tellier LC, et al.: Second-generation PLINK: rising to the challenge of larger and richer datasets [Internet]. GigaScience 2015; 4 [cited 2020 Jul 23] Available from: https://academic.oup.com/gigascience/article/4/1/s13742-015-0047-8/2707533
16. Mallard TT, Linnér RK, Grotzinger AD, et al.: Multivariate GWAS of psychiatric disorders and their cardinal symptoms reveal two dimensions of cross-cutting genetic liabilities. bioRxiv 2020; 603134
17. Willer CJ, Li Y, Abecasis GR: METAL: fast and efficient meta-analysis of genomewide association scans. Bioinforma Oxf Engl 2010; 26:2190–2191
18. Rosseel Y: lavaan: An R Package for Structural Equation Modeling. J Stat Softw 2012; 48:1–36
19. Bulik-Sullivan BK, Loh P-R, Finucane HK, et al.: LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 2015; 47:291–295
20. Grotzinger AD, Rhemtulla M, de Vlaming R, et al.: Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat Hum Behav 2019; 3:513–525
21. Linnér RK, Mallard TT, Barr PB, et al.: Multivariate genomic analysis of 1.5 million people identifies genes related to addiction, antisocial behavior, and health [Internet]. Genetics, 2020 [cited 2020 Dec 14] Available from: http://biorxiv.org/lookup/doi/10.1101/2020.10.16.342501
22. Mallard TT, Linnér RK, Grotzinger AD, et al.: Multivariate GWAS of psychiatric disorders and their cardinal symptoms reveal two dimensions of cross-cutting genetic liabilities [Internet]. Genetics, 2019 [cited 2020 Dec 14] Available from: http://biorxiv.org/lookup/doi/10.1101/603134
23. Cross-Disorder Group of the Psychiatric Genomics Consortium. Electronic address: plee0@mgh.harvard.edu, Cross-Disorder Group of the Psychiatric Genomics Consortium: Genomic Relationships, Novel Loci, and Pleiotropic Mechanisms across Eight Psychiatric Disorders. Cell 2019; 179:1469–1482.e11
24. Watanabe K, Taskesen E, van Bochoven A, et al.: Functional mapping and annotation of genetic associations with FUMA. Nat Commun 2017; 8:1826
25. de Leeuw CA, Mooij JM, Heskes T, et al.: MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput Biol 2015; 11:e1004219
26. Sey NYA, Hu B, Mah W, et al.: A computational tool (H-MAGMA) for improved prediction of brain-disorder risk genes by incorporating brain chromatin interaction profiles. Nat Neurosci 2020; 23:583–593
27. Consortium GTEx, Barbeira AN, Dickinson SP, et al.: Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat Commun 2018; 9:1825
28. Ge T, Chen C-Y, Ni Y, et al.: Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun 2019; 10:1776
29. Begleiter: The Collaborative Study on the Genetics of Alcoholism. Alcohol Health Res World 1995; 19:228–236
30. Messer LC, Laraia BA, Kaufman JS, et al.: The Development of a Standardized Neighborhood Deprivation Index. J Urban Health 2006; 83:1041–1062
31. Dennis J, Sealock J, Levinson RT, et al.: Genetic risk for major depressive disorder and loneliness in gender-specific associations with coronary artery disease: supplementary [Internet]. Genetics, 2019 [cited 2019 Oct 8] Available from: http://biorxiv.org/lookup/doi/10.1101/512541
32. Carroll RJ, Bastarache L, Denny JC: R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics 2014; 30:2375–2376
33. Evangelou E, Gao H, Chu C, et al.: New alcohol-related genes suggest shared genetic mechanisms with neuropsychiatric disorders. Nat Hum Behav 2019; 3:950–961
34. Volkow ND, Morales M: The Brain on Drugs: From Reward to Addiction. Cell 2015; 162:712–725
35. Hinney A, Albayrak O, Antel J, et al.: Genetic variation at the CELF1 (CUGBP, elav-like family member 1 gene) locus is genome-wide associated with Alzheimer’s disease and obesity. Am J Med Genet Part B Neuropsychiatr Genet Off Publ Int Soc Psychiatr Genet 2014; 165B:283–293
36. Davies G, Marioni RE, Liewald DC, et al.: Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N = 112 151). Mol Psychiatry 2016; 21:758–767
37. Savage JE, Jansen PR, Stringer S, et al.: Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat Genet 2018; 50:912–919
38. Marees AT, Smit DJA, Ong J-S, et al.: Potential influence of socioeconomic status on genetic correlations between alcohol consumption measures and mental health. Psychol Med 2020; 50:484–498
39. Fry A, Littlejohns TJ, Sudlow C, et al.: Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population. Am J Epidemiol 2017; 186:1026–1034
40. Gelernter J, Sun N, Polimanti R, et al.: Genome-wide Association Study of Maximum Habitual Alcohol Intake in >140,000 U.S. European and African American Veterans Yields Novel Risk Loci. Biol Psychiatry 2019; 86:365–376
41. Zorrilla EP, Logrip ML, Koob GF: Corticotropin releasing factor: A key role in the neurobiology of addiction. Front Neuroendocrinol 2014; 35:234–244
42. Johnson EC, Sanchez-Roige S, Acion L, et al.: Polygenic contributions to alcohol use and alcohol use disorders across population-based and clinically ascertained samples. Psychol Med 2020; 1–10
43. Munafò MR, Tilling K, Taylor AE, et al.: Collider scope: when selection bias can substantially influence observed associations. Int J Epidemiol 2018; 47:226–235
44. Xue A, Jiang L, Zhu Z, et al.: Genome-wide analyses of behavioural traits biased by misreports and longitudinal changes [Internet]. Genetic and Genomic Medicine, 2020 [cited 2020 Jul 17] Available from: http://medrxiv.org/lookup/doi/10.1101/2020.06.15.20131284
45. Mostafavi H, Harpak A, Agarwal I, et al.: Variable prediction accuracy of polygenic scores within an ancestry group. eLife 2020; 9:e48376
46. Sanchez-Roige S, Palmer AA: Emerging phenotyping strategies will advance our understanding of psychiatric genetics. Nat Neurosci 2020; 23:475–480

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Material

NIHMS1814273-supplement-Supplementary_Material.docx^{(136.5KB, docx)}

Supplementary Tables

NIHMS1814273-supplement-Supplementary_Tables.xlsx^{(51.6MB, xlsx)}

Data Availability Statement

The GWAS summary statistics for each latent AUDIT factor, and the sum score counterparts (AUDIT-C and AUDIT-P), will be made available on the PGC website.

[R1] 1.Sanchez-Roige S, Palmer AA, Clarke T-K: Recent Efforts to Dissect the Genetic Basis of Alcohol Use and Abuse. Biol Psychiatry 2020; 87:609–618 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R2] 2.Walters RK, Polimanti R, Johnson EC, et al. : Transancestral GWAS of alcohol dependence reveals common genetic underpinnings with psychiatric disorders. Nat Neurosci 2018; 21:1656–1669 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R3] 3.Liu M, Jiang Y, Wedow R, et al. : Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat Genet 2019; 51:237–244 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R4] 4.Kranzler HR, Zhou H, Kember RL, et al. : Genome-wide association study of alcohol consumption and use disorder in 274,424 individuals from multiple populations. Nat Commun 2019; 10:1499. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] 5.Zhou H, Sealock JM, Sanchez-Roige S, et al. : Meta-analysis of problematic alcohol use in 435,563 individuals identifies 29 risk variants and yields insights into biology, pleiotropy and causality. Nat Neurosci 2019; 738088 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R6] 6.Sanchez-Roige S, Palmer AA, Fontanillas P, et al. : Genome-Wide Association Study Meta-Analysis of the Alcohol Use Disorders Identification Test (AUDIT) in Two Population-Based Cohorts. Am J Psychiatry 2019; 176:107–118 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] 7.Saunders JB, Aasland OG, Babor TF, et al. : Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO Collaborative Project on Early Detection of Persons with Harmful Alcohol Consumption--II. Addict Abingdon Engl 1993; 88:791–804 [DOI] [PubMed] [Google Scholar]

[R8] 8.Litten RZ, Ryan ML, Falk DE, et al. : Heterogeneity of Alcohol Use Disorder: Understanding Mechanisms to Advance Personalized Treatment. Alcohol Clin Exp Res 2015; 39:579–584 [DOI] [PubMed] [Google Scholar]

[R9] 9.Grotzinger AD, Rhemtulla M, de Vlaming R, et al. : Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat Hum Behav 2019; 3:513–525 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] 10.Boyd A, Golding J, Macleod J, et al. : Cohort Profile: The ‘Children of the 90s’—the index offspring of the Avon Longitudinal Study of Parents and Children. Int J Epidemiol 2013; 42:111–127 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] 11.Boomsma DI, Vink JM, Beijsterveldt TCEM van, et al.: Netherlands Twin Register: A Focus on Longitudinal Research. Twin Res Hum Genet 2002; 5:401–406 [DOI] [PubMed] [Google Scholar]

[R12] 12.Bycroft C, Freeman C, Petkova D, et al. : The UK Biobank resource with deep phenotyping and genomic data. Nature 2018; 562:203–209 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R13] 13.Loh P-R, Tucker G, Bulik-Sullivan BK, et al. : Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet 2015; 47:284–290 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R14] 14.Jiang L, Zheng Z, Qi T, et al. : A resource-efficient tool for mixed model association analysis of large-scale data. Nat Genet 2019; 51:1749–1755 [DOI] [PubMed] [Google Scholar]

[R15] 15.Chang CC, Chow CC, Tellier LC, et al. : Second-generation PLINK: rising to the challenge of larger and richer datasets [Internet]. GigaScience 2015; 4[cited 2020 Jul 23] Available from: https://academic.oup.com/gigascience/article/4/1/s13742-015-0047-8/2707533 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R16] 16.Mallard TT, Linnér RK, Grotzinger AD, et al. : Multivariate GWAS of psychiatric disorders and their cardinal symptoms reveal two dimensions of cross-cutting genetic liabilities. bioRxiv 2020; 603134 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R17] 17.Willer CJ, Li Y, Abecasis GR: METAL: fast and efficient meta-analysis of genomewide association scans. Bioinforma Oxf Engl 2010; 26:2190–2191 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] 18.Rosseel Y: lavaan: An R Package for Structural Equation Modeling. J Stat Softw 2012; 48:1–36 [Google Scholar]

[R19] 19.Bulik-Sullivan BK, Loh P-R, Finucane HK, et al. : LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 2015; 47:291–295 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R20] 20.Grotzinger AD, Rhemtulla M, de Vlaming R, et al. : Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat Hum Behav 2019; 3:513–525 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R21] 21.Linnér RK, Mallard TT, Barr PB, et al. : Multivariate genomic analysis of 1.5 million people identifies genes related to addiction, antisocial behavior, and health [Internet]. Genetics, 2020[cited 2020 Dec 14] Available from: http://biorxiv.org/lookup/doi/10.1101/2020.10.16.342501 [Google Scholar]

[R22] 22.Mallard TT, Linnér RK, Grotzinger AD, et al. : Multivariate GWAS of psychiatric disorders and their cardinal symptoms reveal two dimensions of cross-cutting genetic liabilities [Internet]. Genetics, 2019[cited 2020 Dec 14] Available from: http://biorxiv.org/lookup/doi/10.1101/603134 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R23] 23.Cross-Disorder Group of the Psychiatric Genomics Consortium. Electronic address: plee0@mgh.harvard.edu, Cross-Disorder Group of the Psychiatric Genomics Consortium: Genomic Relationships, Novel Loci, and Pleiotropic Mechanisms across Eight Psychiatric Disorders. Cell 2019; 179:1469–1482.e11 [DOI] [PMC free article] [PubMed] [Google Scholar]

[R24] 24. Watanabe K, Taskesen E, van Bochoven A, et al.: Functional mapping and annotation of genetic associations with FUMA. Nat Commun 2017; 8:1826

[R25] 25. de Leeuw CA, Mooij JM, Heskes T, et al.: MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput Biol 2015; 11:e1004219

[R26] 26. Sey NYA, Hu B, Mah W, et al.: A computational tool (H-MAGMA) for improved prediction of brain-disorder risk genes by incorporating brain chromatin interaction profiles. Nat Neurosci 2020; 23:583–593

[R27] 27. GTEx Consortium, Barbeira AN, Dickinson SP, et al.: Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat Commun 2018; 9:1825

[R28] 28. Ge T, Chen C-Y, Ni Y, et al.: Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun 2019; 10:1776

[R29] 29. Begleiter: The Collaborative Study on the Genetics of Alcoholism. Alcohol Health Res World 1995; 19:228–236

[R30] 30. Messer LC, Laraia BA, Kaufman JS, et al.: The Development of a Standardized Neighborhood Deprivation Index. J Urban Health 2006; 83:1041–1062

[R31] 31. Dennis J, Sealock J, Levinson RT, et al.: Genetic risk for major depressive disorder and loneliness in gender-specific associations with coronary artery disease: supplementary [Internet]. Genetics, 2019 [cited 2019 Oct 8]. Available from: http://biorxiv.org/lookup/doi/10.1101/512541

[R32] 32. Carroll RJ, Bastarache L, Denny JC: R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics 2014; 30:2375–2376

[R33] 33. Evangelou E, Gao H, Chu C, et al.: New alcohol-related genes suggest shared genetic mechanisms with neuropsychiatric disorders. Nat Hum Behav 2019; 3:950–961

[R34] 34. Volkow ND, Morales M: The Brain on Drugs: From Reward to Addiction. Cell 2015; 162:712–725

[R35] 35. Hinney A, Albayrak O, Antel J, et al.: Genetic variation at the CELF1 (CUGBP, elav-like family member 1 gene) locus is genome-wide associated with Alzheimer’s disease and obesity. Am J Med Genet Part B Neuropsychiatr Genet Off Publ Int Soc Psychiatr Genet 2014; 165B:283–293

[R36] 36. Davies G, Marioni RE, Liewald DC, et al.: Genome-wide association study of cognitive functions and educational attainment in UK Biobank (N=112 151). Mol Psychiatry 2016; 21:758–767

[R37] 37. Savage JE, Jansen PR, Stringer S, et al.: Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat Genet 2018; 50:912–919

[R38] 38. Marees AT, Smit DJA, Ong J-S, et al.: Potential influence of socioeconomic status on genetic correlations between alcohol consumption measures and mental health. Psychol Med 2020; 50:484–498

[R39] 39. Fry A, Littlejohns TJ, Sudlow C, et al.: Comparison of Sociodemographic and Health-Related Characteristics of UK Biobank Participants With Those of the General Population. Am J Epidemiol 2017; 186:1026–1034

[R40] 40. Gelernter J, Sun N, Polimanti R, et al.: Genome-wide Association Study of Maximum Habitual Alcohol Intake in >140,000 U.S. European and African American Veterans Yields Novel Risk Loci. Biol Psychiatry 2019; 86:365–376

[R41] 41. Zorrilla EP, Logrip ML, Koob GF: Corticotropin releasing factor: A key role in the neurobiology of addiction. Front Neuroendocrinol 2014; 35:234–244

[R42] 42. Johnson EC, Sanchez-Roige S, Acion L, et al.: Polygenic contributions to alcohol use and alcohol use disorders across population-based and clinically ascertained samples. Psychol Med 2020; 1–10

[R43] 43. Munafò MR, Tilling K, Taylor AE, et al.: Collider scope: when selection bias can substantially influence observed associations. Int J Epidemiol 2018; 47:226–235

[R44] 44. Xue A, Jiang L, Zhu Z, et al.: Genome-wide analyses of behavioural traits biased by misreports and longitudinal changes [Internet]. Genetic and Genomic Medicine, 2020 [cited 2020 Jul 17]. Available from: http://medrxiv.org/lookup/doi/10.1101/2020.06.15.20131284

[R45] 45. Mostafavi H, Harpak A, Agarwal I, et al.: Variable prediction accuracy of polygenic scores within an ancestry group. eLife 2020; 9:e48376

[R46] 46. Sanchez-Roige S, Palmer AA: Emerging phenotyping strategies will advance our understanding of psychiatric genetics. Nat Neurosci 2020; 23:475–480
