Abstract
Background:
Studies of alcohol use frequently utilize alcohol-related cue imagery. Although a number of alcohol image databases currently exist, they have several limitations: many are not publicly available, some use stock images or clip-art rather than real photographs, several exclude any photographs displaying brand information, and most contain relatively few images. The aim of this project was to develop a large, open-access database of alcohol-related cue images, containing photographs with and without brand information, taken in real-world environments, with images in a variety of orientations and dimensions.
Methods:
The study collected 1,650 images from volunteers in the larger community, to capture photographs with a wide range of content, environments, and relation to alcohol. All images were then rated on scales of valence, arousal, and relation to alcohol by 1,008 Amazon Mechanical Turk workers, using classical emotion validation methods based on the International Affective Picture System (IAPS). Survey respondents were screened with the Alcohol Use Disorders Identification Test (AUDIT), and Cronbach’s alpha scores were calculated to determine the interrater reliability of scores across the whole sample, and within low-risk, moderate-risk, and high-risk drinkers for each rating domain. Univariate ANOVA were run to determine differences in ratings across drinking groups.
Results:
All Cronbach’s alpha scores indicated high interrater reliability within the whole sample, and across drinking severity groups. Tukey’s HSD post-hoc results indicated greater arousal and affect in response to image viewing in moderate and high-risk drinkers, and higher relation to alcohol ratings in low-risk drinkers. All images had categorization tags assigned by members of the study team.
Conclusion:
The established imagery set includes 1,650 alcohol-related images, rated on scales of valence, arousal, and relation to alcohol, and categorized by type of alcohol depicted. The images are available for use and download through Google Photos at https://photos.app.goo.gl/LsJrAHFXhpSAokrn6.
Keywords: alcohol, image, picture, cue, validation
Introduction
Studies of alcohol use frequently utilize alcohol-related cue imagery to examine approach or avoidance behaviors (Krueger, 2015), to induce craving (Austin et al., 2006, Grüsser et al., 2000, Lovett et al., 2015), or to examine psychometric differences between problematic and non-problem drinkers (Kreusch et al., 2013, Lee et al., 2006) among other research aims. Given that regular alcohol consumption is widespread across the world (GBD-AC, 2018) and that alcohol use disorder (AUD) is a devastating condition with minimally effective treatments, the need for well validated, widely available images to examine alcohol use and misuse in a variety of settings is a paramount methodological issue. Several alcohol-related image databases currently exist (Billieux et al., 2011, Fey et al., 2016, Grüsser et al., 2000, Krueger, 2015, Lee et al., 2006, López-Caneda and Carbia, 2018, Lovett et al., 2015, Pronk et al., 2015, Staugger et al., 2017); unfortunately, many of these have limitations that affect their generalizability to other research studies. Many of the image sets are not readily available to the public for various reasons (Lovett et al., 2015, Lee et al., 2006, Fey et al., 2016) and at least one was created to address the specific needs of particular study designs (i.e. only depicting certain kinds of alcoholic beverages) (Lee et al., 2006).
The composition of the images is an issue that limits the generalizability of the image sets. At least one image set is comprised of clip-art images with no brand information, actors, or settings (Krueger, 2015) and several use stock photographs of individuals holding or interacting with alcohol beverages (López-Caneda and Carbia, 2018, Pronk et al., 2015, Kreusch et al., 2013). Research has shown that heavy drinkers experience stronger craving and are more physiologically aroused by alcohol images taken in real drinking contexts rather than animated or staged images (Lee et al., 2006, López-Caneda and Carbia, 2018, Nees et al., 2012), which may help explain why brand loyalty increases interest in specific types of alcohol. This finding suggests that a highly successful alcohol-research imagery tool would contain images related to alcohol, photographed in a variety of real-life settings depicting various behaviors, some containing brand information and others without specific branding.
The number of images available can also be a major limitation for small image sets (Billieux et al., 2011). The paradigms that utilize these alcohol-related cue images often require the presentation of a high volume of images or multiple presentations of cues across separate time points or tasks. For example, our research group presents alcohol-related images during a seven-minute functional magnetic resonance imaging (fMRI) scan; with each image presented for 10 seconds, one session requires 42 images. If we wish to scan participants at multiple time points, for instance pre- and post-intervention, nearly 100 images would be needed. The established databases, some with only 40 or 60 images, would not fulfill these research needs without repeated presentation of images.
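The image counts in this example follow from simple arithmetic, sketched below. The function name `images_needed` is hypothetical, and the sketch assumes images are shown back to back with no gaps between them and that no image is repeated.

```python
def images_needed(scan_minutes: float, seconds_per_image: float, sessions: int = 1) -> int:
    """Number of unique images needed if no image is ever repeated."""
    # Assumption: images are presented back to back, with no gaps between them.
    per_session = int(scan_minutes * 60 // seconds_per_image)
    return per_session * sessions


print(images_needed(7, 10))              # one seven-minute session
print(images_needed(7, 10, sessions=2))  # pre- and post-intervention sessions
```

Under these assumptions one session uses 42 images and two sessions use 84, which already exceeds the 40 or 60 images offered by some existing sets.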
Finally, alcohol-image databases with relatively few images require that all participants view the same images, regardless of the content of the image or the participant’s preference for alcohol. This means that a participant who only drinks beer will view images of wine, liquor, and mixed drinks in addition to images related to beer, which may not induce the desired effect, since thoughts of wine or liquor may not induce their desire for beer. A research-oriented alcohol-image set should be large enough that there are sufficient numbers of images in a variety of categories (such as beer, wine, liquor, at home, in bars, or in shopping environments) that participants can view an entire set of images that pertain to them. The primary goal of this project was to build such an image dataset and produce validated image ratings. Researchers or participants would be able to choose the subset of images that will be most useful to them, whether that distinction is based on image content, the presence or absence of brand information, or the ratings of the images.
The central aim of this project was to establish a large, open-access database of alcohol-related cue images depicting multiple types of alcohol, with and without brand information, taken in real-world environments, in a variety of orientations and dimensions. Images were collected by non-scientists in the community who contribute to the scientific field, known as citizen scientists. Each image was then rated by approximately 147 people on average, recruited through an internet-based data collection system known as MTurk. Each photograph was rated using classical emotional picture valence methods developed by the Center for the Study of Emotion and Attention (CSEA-NIMH, 2002). Cronbach’s alpha scores were calculated to determine interrater reliability of the rating scales across the entire participant sample, as well as among low-risk, moderate-risk, and high-risk alcohol consumers, and univariate ANOVA determined differences in ratings across these drinking severity groups. The images are available for use and download through Google Photos at https://photos.app.goo.gl/LsJrAHFXhpSAokrn6.
Materials and Methods
Participants
Crowdsourcing platforms are rapidly gaining ground in the scientific community (Difallah et al., 2015), as they allow for the recruitment and testing of many highly diverse participants across wide geographic ranges in short time frames. Amazon’s Mechanical Turk (MTurk) is a common crowdsourcing platform used by a growing number of psychologists and other researchers. MTurk responses have been demonstrated as nearly indistinguishable from responses collected from in-person surveys and social media posted surveys, even within highly socioeconomically and ethnically diverse samples (Casler et al., 2013). As such, MTurk is an ideal tool for collecting many ratings from a highly diverse population of responders. This study utilized MTurk to collect a high volume of ratings rapidly for this large set of images.
All image raters were MTurk workers. Responders were limited to workers over the age of 18, who were primarily English speaking, and located in the United States. We did not collect specific demographic information on our sample, but overall MTurk workers have been reported to be well balanced by gender (overall 51% female); younger than the general population (20% born after 1990, 60% born after 1980, and 80% born after 1970); 40% single, 42% married, and 5% divorced; and having household incomes below the national average (median MTurk worker household income of $47K, median US household income of $57K) with approximately 7,500 active MTurk workers at any given time point (Difallah et al., 2018).
All survey respondents completed a digital informed consent, approved by the Wake Forest Health Sciences Institutional Review Board.
Measures
The Alcohol Use Disorders Identification Test (AUDIT) questionnaire is a 10-item self-inventory that was designed to identify harmful or hazardous patterns of drinking (Babor et al., 2001). Items 1–3 quantify individual differences in alcohol consumption patterns, items 4–6 assess difficulty managing drinking impulses, and items 7–10 capture alcohol-related problems and dangers (Babor et al., 2001, Lang et al., 2015, Gache et al., 2005). The first 8 items are scored 0–4, and the final two items are scored 0, 2, or 4, for a maximum possible AUDIT score of 40. The World Health Organization states that scores below 6 for women and 7 for men demonstrate low-risk consumption, while scores of 8–15 generally demonstrate a moderate level of alcohol-related problems (where consumers should seek advice for the reduction of hazardous drinking), and scores of 16 or above suggest high-risk or hazardous drinking where counseling and diagnostic evaluation should be sought. We also included an 11th question with the AUDIT, probing what type(s) of alcohol respondents typically consume if they drink alcohol. The AUDIT was used to categorize survey respondents as low-risk drinkers (scores 7 or below), moderate-risk drinkers (scores 8–15), and high-risk drinkers (scores 16 or higher). Although the AUDIT does have specific cut-offs for males and females, gender was not collected from our MTurk image raters, so overall cut-offs of 7 and 15 were chosen to separate low and moderate-risk drinkers and moderate and high-risk drinkers. Average AUDIT responses by question and by drinking severity can be found in the Supplemental Materials.
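The scoring and classification rules described above can be sketched in a few lines of Python. This is an illustration only: the function names `audit_total` and `risk_group` are hypothetical, and the sketch covers only the 10 standard scored items, not the added 11th question.

```python
from typing import Sequence


def audit_total(items: Sequence[int]) -> int:
    """Sum the 10 AUDIT item scores after checking each is in its allowed range."""
    if len(items) != 10:
        raise ValueError("AUDIT has exactly 10 scored items")
    for number, score in enumerate(items, start=1):
        # Items 1-8 are scored 0-4; items 9 and 10 are scored 0, 2, or 4.
        allowed = {0, 2, 4} if number >= 9 else {0, 1, 2, 3, 4}
        if score not in allowed:
            raise ValueError(f"item {number}: score {score} is not allowed")
    return sum(items)


def risk_group(total: int) -> str:
    """Classify a total score using the gender-neutral cut-offs of 7 and 15."""
    if total <= 7:
        return "low-risk"
    if total <= 15:
        return "moderate-risk"
    return "high-risk"


print(risk_group(audit_total([2, 1, 1, 0, 0, 0, 1, 0, 2, 0])))
```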
Procedure
Image Collection
Citizen science involves having community volunteers collect or process data in order to “broaden the scope of research and enhance the ability to collect scientific data” (Cohen, 2008, Silvertown, 2009). We utilized the growing citizen science trend to collect our alcohol images; by amassing images from many members of the community we were able to collect images in a wide variety of settings and environments, depicting a wide range of alcohol beverages and cues, ranging from craft beers in pop-up microbreweries, to wine glasses in vineyards, shots of liquor in bars, and bottle openers and shelves of glasses in home kitchens. Images were collected using a Google Drive survey distributed via social media by members of the study team. Instructions for anyone wishing to contribute images are presented in the Supplemental Materials. Study staff approved all submitted photographs with the following criteria: images were related to or depicted alcohol beverages, drinking behaviors, or alcohol cues, and displayed no prominent faces, features that could identify the photographer or anyone depicted in the photographs, or inappropriate content. Any photographs that did not meet these criteria, or that were overly blurry, were removed from the set. All others were assigned a numerical name, based on their order of submission. Nearly 1,700 images were submitted; a total of 1,650 remained after image approval. Sample images can be seen in Figure 1. The images are available for use and download through Google Photos at https://photos.app.goo.gl/LsJrAHFXhpSAokrn6.
Figure 1.
Sample Images from the WFAIS.
Image Categorization
Categorizations were collected from three members of the study team. All images were classified as depicting or being related to beer, wine, red wine, white wine, champagne, liquor, mixed drink/cocktail, cider, accessories/environment, or other. All categorizers were instructed to select all choices that were applicable to each image. For any image where it was appropriate, categorizers were also given the opportunity to provide a more detailed categorization including type of beverage (e.g. martini, margarita, Moscow Mule, mimosa) or type of alcohol (e.g. vodka, gin, IPA, whiskey, lager), as well as any brand information. All categorizations are attached to individual images as features, which can be searched or identified in a keyword search of the database. Image categorization instructions and a full list of categorizations are available in the Supplemental Materials.
Surveys
Surveys were created in, managed, and administered through REDCap (Research Electronic Data Capture) electronic data capture tools hosted at Wake Forest University (Harris et al., 2009). REDCap is a secure, web-based application designed to support data capture for research studies, providing (1) an intuitive interface for validated data entry, (2) audit trails for tracking data manipulation and export procedures, (3) automated export procedures for seamless data downloads to common statistical packages, and (4) procedures for importing data from external sources (Harris et al., 2009). Each rating survey comprised general instructions, an informed consent (approved by the Wake Forest Health Sciences Institutional Review Board), the adapted AUDIT questionnaire, and specific rating instructions adapted from Billieux et al. (2011) (Supplemental Materials).
The International Affective Picture System (IAPS) is “a set of normative emotional stimuli for experimental investigations” (CSEA-NIMH, 2002). The IAPS contains a large number of emotionally-evocative images, as well as neutral images, across a wide range of semantic categories, all of which are internationally available. All of the IAPS images have been rated on dimensions of valence, arousal, and dominance. The IAPS images are widely used in a variety of study settings, particularly the neutral images, for comparison to task images. As such, the rating system established with IAPS is standard for equivalently produced databases. We elected to use only the valence and arousal dimensions of this standard rating procedure, as we are unaware of any alcohol-related studies that select images from the IAPS based solely on their dominance ratings. Instead, we created a third scale assessing relation to alcohol, in order to verify that all of our images evoke thoughts of alcohol when viewed, and to assess to what degree.
A total of 11 surveys were created from the 1,650 submitted images. Each survey included 150 images, each with the three corresponding rating scales (valence, arousal, and relation to alcohol), as well as two validity checks. The 11 sets of 150 images are sequentially labeled as Image Sets A-K (see Table 1) for the remainder of the document. The rating scales asked, “How happy-unhappy does this image make you feel?”, “How excited-calm does this image make you feel?”, and “How related to alcohol is this image?”. Each rating scale ranged from 0–100, with lower values indicating higher valence and arousal. More precisely, the valence scale varied from feeling happy, pleased, satisfied, contented, or hopeful (a value of 0) to feeling unhappy, annoyed, unsatisfied, melancholic, despaired, or bored (a value of 100) and the arousal scale varied from feeling excited, stimulated, frenzied, jittery, wide-awake, or aroused (a value of 0) to feeling calm, relaxed, sluggish, dull, sleepy, or unaroused (a value of 100). The relation to alcohol scale varied from no relation to alcohol (a value of 0) to high relation to alcohol (a value of 100).
Table 1.
Image Set names based on image numbers. The numbers correspond to the final naming structure in the image database.
| Image Numbers | Image Set Name |
|---|---|
| 00001–00150 | Image Set A |
| 00151–00300 | Image Set B |
| 00301–00450 | Image Set C |
| 00451–00600 | Image Set D |
| 00601–00750 | Image Set E |
| 00751–00900 | Image Set F |
| 00901–01050 | Image Set G |
| 01051–01200 | Image Set H |
| 01201–01350 | Image Set I |
| 01351–01500 | Image Set J |
| 01501–01650 | Image Set K |
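Because the 11 sets are consecutive blocks of 150 images, the set for any image number can be computed directly. The sketch below assumes the blocks do not overlap and that image numbers run from 1 to 1,650; the function name `image_set_name` is hypothetical.

```python
import string

IMAGES_PER_SET = 150
TOTAL_IMAGES = 1650


def image_set_name(image_number: int) -> str:
    """Return the Image Set label (A-K) for an image number from 1 to 1650."""
    if not 1 <= image_number <= TOTAL_IMAGES:
        raise ValueError(f"image number {image_number} is outside 1-{TOTAL_IMAGES}")
    # Images 1-150 fall in block 0 (A), 151-300 in block 1 (B), and so on.
    block = (image_number - 1) // IMAGES_PER_SET
    return f"Image Set {string.ascii_uppercase[block]}"


print(image_set_name(901))
```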
The two validity checks asked, “What day falls between Tuesday and Thursday?” with multiple choice responses of “Monday”, “Wednesday”, “Friday”, and “Sunday” and “Please retype the sixth word from the following sentence: the quick brown fox jumped over the lazy red dogs.”. These checks were intended to screen for bots, and anyone rapidly and arbitrarily selecting responses. All responders who failed either or both of these checks were excluded from analysis as detailed in the results section. A sample image and associated questions extracted from the survey can be found in Figure 2.
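A response could be screened against these two checks along the following lines. This is a sketch, not the procedure the study used: the function names are hypothetical, and ignoring letter case and surrounding whitespace is an assumption, since the matching rules are not stated.

```python
SENTENCE = "the quick brown fox jumped over the lazy red dogs."


def sixth_word(sentence: str) -> str:
    """Return the sixth whitespace-separated word, without trailing punctuation."""
    return sentence.split()[5].strip(".,").lower()


def passes_checks(day_answer: str, typed_word: str) -> bool:
    """True only if both validity checks are answered correctly."""
    # Assumption: answers are compared ignoring case and surrounding whitespace.
    day_ok = day_answer.strip().lower() == "wednesday"
    word_ok = typed_word.strip().lower() == sixth_word(SENTENCE)
    return day_ok and word_ok


print(passes_checks("Wednesday", "over"))
```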
Figure 2.
Sample Survey page extracted from an Image Rating Survey.
Each of the 11 surveys was posted on MTurk in small batches, collecting nine responses at a time in order to collect ratings on all 1,650 images while limiting participant burden and simplifying data management. As batches were completed, surveys were reposted, such that all 11 surveys were always available while responses were being collected. All MTurk listings were approved by the Wake Forest Health Sciences Institutional Review Board. Once a worker initiated the assignment, they were redirected to the appropriate REDCap survey, and allowed 24 hours to complete the survey. Each survey concluded with a response code, which responders were asked to list in an MTurk response field as proof of survey completion. All raters were paid $2 for successful completion of a survey. Any responder who did not provide the appropriate response code was not paid for their participation. A total of 1,008 MTurk workers successfully completed at least one survey. Survey respondents were able to complete each survey at most once, but were allowed to complete as many or as few of the 11 surveys as they wished.
Analysis
Participants were classified as low-risk drinkers (scores 7 or below), moderate-risk drinkers (scores 8–15) or high-risk drinkers (scores 16 or above) based on their responses to the standard AUDIT questionnaire. All incomplete surveys (missing either AUDIT responses or image ratings) were removed from analysis. Any complete survey without correct responses to both validity checks was removed from analysis. Table 2 lists how many responders completed each survey on average, for surveys A-K. Cronbach’s alpha scores were calculated in SPSS v.24 for each rating scale (valence, arousal, and relation to alcohol) for each of the 11 surveys, for the whole group, and for low-risk, moderate-risk, and high-risk drinkers. This method allowed us to avoid any complication from individual raters completing more than one survey. Cronbach’s alpha scores above 0.7 are considered to indicate satisfactory internal consistency within a group of raters.
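For readers without SPSS, the standard Cronbach’s alpha formula, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_r s_r^2}{s_T^2}\right)$, can be sketched in plain Python. This is not the SPSS procedure used in the study: treating the $k$ raters as the "items" and the images as the "cases" is an assumption about how the data were arranged, the function name `cronbach_alpha` is hypothetical, and the toy ratings are invented for illustration.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(ratings: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha, where ratings[r][i] is rater r's rating of image i."""
    k = len(ratings)
    if k < 2:
        raise ValueError("at least two raters are required")
    n_images = len(ratings[0])
    if any(len(row) != n_images for row in ratings):
        raise ValueError("every rater must rate the same images")
    # Sum of each rater's sample variance across images.
    sum_rater_variances = sum(variance(row) for row in ratings)
    # Sample variance of the per-image totals (summed over raters).
    totals = [sum(row[i] for row in ratings) for i in range(n_images)]
    total_variance = variance(totals)
    return (k / (k - 1)) * (1 - sum_rater_variances / total_variance)


# Invented toy data: three raters, four images, ratings on the 0-100 scale.
toy = [[10, 20, 30, 40], [12, 22, 28, 42], [8, 18, 33, 39]]
print(round(cronbach_alpha(toy), 3))
```

The function fails with a division by zero if every image receives the same total, so real data should be checked for that case first.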
Table 2.
Total completed surveys reflect the number of surveys with complete data. Check 1 was a multiple-choice check, and Check 2 required rewriting a single word from a sentence. Any raters who failed one or both checks were removed from analysis. Raters classified as moderate or high-risk drinkers who only indicated they did not typically consume alcohol were removed from analysis. Total Responses reflect the number of ratings remaining for each survey after these failures were removed. Frequencies are listed as counts and percentages. Full counts for individual image sets can be found in the Supplemental Tables.
| | Total Completed Surveys | Failed Check 1 | Failed Check 2 | Failed Both Checks | Removed for Drinking Classification | Total Responses | Low-Risk Drinkers | Moderate-Risk Drinkers | High-Risk Drinkers |
|---|---|---|---|---|---|---|---|---|---|
| Averages (all Image Sets) | 182.3 | 15.9 (8.7%) | 26.1 (14.3%) | 7.8 | 0.3 | 147.8 | 36.7 (24.8%) | 34.8 (23.5%) | 76.5 (51.7%) |
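The averages in Table 2 can be reconciled with a short calculation. The sketch below assumes that the Failed Check 1 and Failed Check 2 columns each include the raters who failed both checks, so those raters are subtracted only once; the reported figures agree with that reading. The function name `remaining_responses` is hypothetical.

```python
def remaining_responses(
    completed: float,
    failed_check1: float,
    failed_check2: float,
    failed_both: float,
    removed_drinking: float,
) -> float:
    """Responses left after removing failed checks and drinking-classification removals."""
    # Assumption: raters who failed both checks appear in both single-check counts,
    # so they are subtracted once via inclusion-exclusion.
    failed_any = failed_check1 + failed_check2 - failed_both
    return completed - failed_any - removed_drinking


total = remaining_responses(182.3, 15.9, 26.1, 7.8, 0.3)
print(round(total, 1))
```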
Univariate ANOVA were run to determine between-group differences in ratings. ANOVA were run in SPSS v.24 separately for each survey to avoid complications from individuals who rated more than one survey. Average valence, arousal, and alcohol ratings served as dependent variables, with drinking severity as the fixed factor. For any significant ANOVA finding, Tukey’s Honest Significant Difference (HSD) tests were run post-hoc to examine which drinking severity groups differed.
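The F statistic underlying a one-way ANOVA of this kind can be sketched in plain Python. This is an illustration of the textbook formula, not the SPSS analysis: it computes only F (no p-value and no Tukey HSD comparisons), the function name `one_way_anova_f` is hypothetical, and the three groups of values are invented.

```python
from statistics import mean
from typing import Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic for a one-way ANOVA across independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares around each group's own mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Invented toy ratings for low-, moderate-, and high-risk groups.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```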
Results
Sample images from the dataset can be found in Figure 1. A total of 1,008 MTurk workers completed surveys and submitted correct completion codes. Raters were able to take as many of the 11 surveys as they wished, although most elected to complete only one survey. Of the 1,008 total raters, 608 raters (60.3%) completed only 1 survey, 190 raters (18.8%) completed 2 surveys, 68 raters (6.7%) completed 3 surveys, 36 raters (3.6%) completed 4 surveys, 38 raters (3.8%) completed 5 surveys, 15 raters (1.5%) completed 6 surveys, 22 raters (2.2%) completed 7 surveys, 11 raters (1.1%) completed 8 surveys, 10 raters (1.0%) completed 9 surveys, 6 raters (0.6%) completed 10 surveys, and 4 raters (0.4%) completed all 11 surveys. See Table 2 for the average numbers of completed surveys, failed checks, and drinking-risk classifications across all surveys. Counts for each individual survey can be found in the Supplemental Materials. On average, 15.9 raters (8.7%) failed the multiple-choice check, 26.1 raters (14.3%) failed the written check, and 7.8 raters failed both checks. On average, 147.8 responses remained for each survey after removing incomplete responses and any response with at least one failed check. On average, 36.7 raters (24.8%) fell into the low-risk drinking category, 34.8 raters (23.5%) fell into the moderate-risk drinking category, and 76.5 raters (51.7%) fell into the high-risk drinking category. Average responses to each of the 10 standard AUDIT items for each of the drinking severity groups can be found in the Supplemental Materials.
Raters were also categorized by the type of alcohol they typically consume. Each rater was asked to check all choices that described their typical alcohol consumption. Any participants who were categorized as moderate or high-risk drinkers but only checked the drinking response “I do not typically consume alcohol” were removed from analysis. If they checked the “I do not typically consume alcohol” choice in addition to other responses, they were included in analysis, as this response could demonstrate consumption only on a few days of the week, or not regularly. Classifications of raters by typical alcohol consumption are in the Supplemental Materials, and average classifications for the full sample can be found in Table 3. Overall, 34.3% of raters were typical beer consumers, 31.3% of raters were typical wine consumers, 23.9% of raters were typical liquor consumers, 6.7% of raters were typical cider consumers, 5.6% of raters typically consumed other types of alcohol, and 2.7% of raters did not typically consume alcohol. Each rater could fall into more than one of these categories, as they were asked to check all choices that applied to them.
Table 3.
Raters indicated the type(s) of alcohol they typically consume in an 11th item added to the AUDIT questionnaire. Raters in the moderate or high-risk categories who indicated they did not typically consume alcohol indicated at least one other response. Frequencies are listed as counts for individual surveys, and as percentages for the full sample of responders. Full frequencies for individual Image Sets can be found in the Supplemental Tables.
| Type of Alcohol Typically Consumed | Low-Risk Drinkers | Moderate-Risk Drinkers | High-Risk Drinkers |
|---|---|---|---|
| Total Averages | |||
| Beer | 31.4% | 37.8% | 33.6% |
| Wine | 26.8% | 32.8% | 34.3% |
| Liquor | 26.6% | 23.1% | 22.1% |
| Cider | 7.4% | 4.8% | 7.9% |
| Other | 2.8% | 2.2% | 2.0% |
| Do Not Typically Consume Alcohol | 6.2% | 1.4% | 0.6% |
Average ratings for the full image set, for all raters and for low-risk, moderate-risk, and high-risk drinkers, can be found in the Supplemental Materials. Image categorizations can also be found in the Supplemental Materials. It is important to note that the valence and arousal ratings were based on reversed scales. Thus, larger scores were associated with lower valence (i.e. greater unhappiness) and lower arousal (i.e. more calm feelings). The alcohol ratings were not reverse scored, so higher ratings indicate greater relation to alcohol. For all images, the average valence rating was 39.106 for low-risk drinkers, 31.347 for moderate-risk drinkers, and 31.027 for high-risk drinkers, with an overall average valence rating of 33.827. The average arousal rating was 48.014 for low-risk drinkers, 36.955 for moderate-risk drinkers, and 37.999 for high-risk drinkers, with an overall average arousal rating of 40.989. The average relation to alcohol rating was 71.104 for low-risk drinkers, 64.123 for moderate-risk drinkers, and 61.559 for high-risk drinkers, with an overall average alcohol relation rating of 65.595.
All Cronbach’s alpha values fell above 0.98, indicating high interrater reliability, in all surveys, for the full set of responses, as well as for low-risk, moderate-risk, and high-risk consumers. Cronbach’s alpha values can be found in Table 4.
Table 4.
Cronbach’s alpha values for each rating survey, for the full sample of respondents, as well as for low-risk, moderate-risk, and high-risk drinkers.
| Survey | Valence Cronbach’s Alpha | Arousal Cronbach’s Alpha | Alcohol Cronbach’s Alpha |
|---|---|---|---|
| Full Sample | |||
| Image Set A | 0.988 | 0.990 | 0.991 |
| Image Set B | 0.992 | 0.994 | 0.992 |
| Image Set C | 0.994 | 0.994 | 0.992 |
| Image Set D | 0.994 | 0.994 | 0.990 |
| Image Set E | 0.994 | 0.993 | 0.990 |
| Image Set F | 0.991 | 0.991 | 0.992 |
| Image Set G | 0.993 | 0.994 | 0.994 |
| Image Set H | 0.992 | 0.994 | 0.993 |
| Image Set I | 0.993 | 0.994 | 0.993 |
| Image Set J | 0.993 | 0.991 | 0.994 |
| Image Set K | 0.993 | 0.993 | 0.993 |
| Low-risk | |||
| Image Set A | 0.991 | 0.994 | 0.992 |
| Image Set B | 0.990 | 0.993 | 0.995 |
| Image Set C | 0.987 | 0.988 | 0.988 |
| Image Set D | 0.986 | 0.990 | 0.986 |
| Image Set E | 0.992 | 0.989 | 0.985 |
| Image Set F | 0.988 | 0.991 | 0.995 |
| Image Set G | 0.991 | 0.992 | 0.994 |
| Image Set H | 0.989 | 0.993 | 0.992 |
| Image Set I | 0.988 | 0.989 | 0.987 |
| Image Set J | 0.989 | 0.988 | 0.991 |
| Image Set K | 0.991 | 0.991 | 0.995 |
| Moderate-risk | |||
| Image Set A | 0.988 | 0.992 | 0.991 |
| Image Set B | 0.992 | 0.994 | 0.988 |
| Image Set C | 0.994 | 0.995 | 0.994 |
| Image Set D | 0.995 | 0.994 | 0.987 |
| Image Set E | 0.995 | 0.994 | 0.989 |
| Image Set F | 0.989 | 0.988 | 0.987 |
| Image Set G | 0.990 | 0.993 | 0.994 |
| Image Set H | 0.992 | 0.993 | 0.991 |
| Image Set I | 0.995 | 0.996 | 0.995 |
| Image Set J | 0.990 | 0.989 | 0.995 |
| Image Set K | 0.995 | 0.995 | 0.994 |
| High-risk | |||
| Image Set A | 0.992 | 0.993 | 0.993 |
| Image Set B | 0.992 | 0.994 | 0.992 |
| Image Set C | 0.994 | 0.993 | 0.990 |
| Image Set D | 0.995 | 0.995 | 0.992 |
| Image Set E | 0.993 | 0.993 | 0.992 |
| Image Set F | 0.993 | 0.992 | 0.990 |
| Image Set G | 0.993 | 0.994 | 0.995 |
| Image Set H | 0.994 | 0.994 | 0.994 |
| Image Set I | 0.995 | 0.995 | 0.994 |
| Image Set J | 0.994 | 0.993 | 0.995 |
| Image Set K | 0.991 | 0.991 | 0.990 |
ANOVA results showed significant differences between drinking severity groups in ratings of valence, arousal, and relation to alcohol. Full ANOVA and post-hoc test results can be seen in Table 5. Tukey’s HSD post-hoc analyses revealed differences between the drinking severity groups for each survey. For the arousal ratings, Image Sets A-G and K showed higher ratings in the low-risk drinkers than the high-risk drinkers. Image Sets B, F, G, I, and K showed higher ratings in the low-risk drinkers than the moderate-risk drinkers. Image Sets H and J showed no significant differences between drinking groups (see Table 5). For the valence ratings, Image Sets A-H and K showed higher ratings in low-risk drinkers than in high-risk drinkers. Image Sets A, B, D, F, G, I, and K showed higher ratings in the low-risk drinkers than the moderate-risk drinkers. Image Set J showed no significant differences between drinking groups (see Table 5). For the relation to alcohol ratings, Image Sets A-C and F-K showed higher ratings in the low-risk drinkers than the high-risk drinkers. Image Sets A, B, F, H, and K showed higher ratings in low-risk drinkers than in moderate-risk drinkers. Image Sets D and E showed no significant differences between drinking groups (see Table 5). Again, note that lower valence and arousal scores correspond to greater feelings of happiness and excitement, respectively. Lower relation to alcohol ratings correspond to less relation to alcohol.
Table 5.
ANOVA and post-hoc Tukey’s HSD results. Significant results are bolded. Lower valence and arousal scores correspond to greater happiness and excitement, respectively. Higher relation to alcohol scores indicate greater relevance to alcohol. The Low and Mod, Low and High, and Mod and High columns list the post-hoc Tukey’s HSD p-values for the mean differences between the listed groups.
| Survey | ANOVA | Low-Risk Mean (SD) | Mod-Risk Mean (SD) | High-Risk Mean (SD) | Low and Mod | Low and High | Mod and High |
|---|---|---|---|---|---|---|---|
| Arousal | |||||||
| Image Set A | F = 6.325, p = 0.002 | 32.441 (2.177) | 32.046 (2.294) | 29.892 (1.611) | p = 0.054 | p = 0.002 | p = 0.723 |
| Image Set B | F = 8.885, p < 0.001 | 42.318 (2.525) | 29.166 (2.525) | 30.604 (1.760) | p = 0.001 | p = 0.001 | p = 0.887 |
| Image Set C | F = 4.493, p = 0.013 | 38.226 (2.276) | 32.685 (2.475) | 29.851 (1.620) | p = 0.229 | p = 0.009 | p = 0.604 |
| Image Set D | F = 4.088, p = 0.019 | 37.339 (2.597) | 31.252 (2.999) | 28.361 (1.765) | p = 0.278 | p = 0.014 | p = 0.685 |
| Image Set E | F = 4.193, p = 0.017 | 41.775 (2.813) | 34.989 (2.729) | 32.004 (1.863) | p = 0.197 | p = 0.012 | p = 0.639 |
| Image Set F | F = 5.501, p = 0.005 | 43.132 (2.196) | 34.267 (2.286) | 34.780 (1.663) | p = 0.016 | p = 0.008 | p = 0.982 |
| Image Set G | F = 7.209, p = 0.001 | 38.586 (2.580) | 29.273 (2.235) | 27.240 (1.533) | p = 0.019 | p = 0.001 | p = 0.734 |
| Image Set H | F = 3.048, p = 0.051 | 36.677 (2.211) | 35.419 (2.270) | 30.531 (1.627) | p = 0.917 | p = 0.680 | p = 0.190 |
| Image Set I | F = 3.168, p = 0.045 | 36.735 (2.433) | 27.971 (2.561) | 31.525 (1.731) | p = 0.037 | p = 0.192 | p = 0.485 |
| Image Set J | F = 2.306, p = 0.103 | 36.201 (2.485) | 30.239 (2.691) | 30.105 (1.545) | p = 0.237 | p = 0.097 | p = 0.999 |
| Image Set K | F = 6.725, p = 0.002 | 39.739 (2.416) | 27.510 (2.448) | 31.519 (1.720) | p = 0.001 | p = 0.017 | p = 0.375 |
| Valence | |||||||
| Image Set A | F = 6.643, p = 0.002 | 50.376 (2.859) | 37.623 (3.014) | 38.512 (2.116) | p = 0.007 | p = 0.003 | p = 0.968 |
| Image Set B | F = 10.090, p < 0.001 | 52.781 (3.071) | 33.950 (3.071) | 39.775 (2.141) | p < 0.001 | p = 0.002 | p = 0.268 |
| Image Set C | F = 4.562, p = 0.012 | 47.012 (2.677) | 38.668 (2.910) | 37.265 (1.905) | p = 0.091 | p = 0.010 | p = 0.914 |
| Image Set D | F = 7.370, p = 0.001 | 49.392 (3.003) | 37.430 (3.467) | 35.665 (2.040) | p = 0.027 | p = 0.001 | p = 0.899 |
| Image Set E | F = 4.015, p = 0.020 | 47.343 (2.809) | 39.425 (2.725) | 37.908 (1.860) | p = 0.111 | p = 0.016 | p = 0.890 |
| Image Set F | F = 7.061, p = 0.001 | 51.037 (2.474) | 42.332 (2.575) | 39.460 (1.874) | p = 0.042 | p = 0.001 | p = 0.640 |
| Image Set G | F = 6.599, p = 0.002 | 47.635 (2.258) | 34.122 (2.822) | 34.626 (1.936) | p = 0.006 | p = 0.002 | p = 0.988 |
| Image Set H | F = 3.140, p = 0.046 | 48.221 (2.767) | 42.312 (2.841) | 39.618 (2.036) | p = 0.299 | p = 0.035 | p = 0.722 |
| Image Set I | F = 4.268, p = 0.016 | 43.745 (2.823) | 31.778 (2.972) | 38.389 (2.008) | p = 0.011 | p = 0.272 | p = 0.159 |
| Image Set J | F = 1.234, p = 0.294 | 42.740 (2.782) | 36.431 (3.012) | 38.986 (1.729) | p = 0.276 | p = 0.487 | p = 0.743 |
| Image Set K | F = 9.035, p < 0.001 | 47.875 (2.616) | 32.432 (2.651) | 37.788 (1.862) | p < 0.001 | p = 0.006 | p = 0.227 |
| Alcohol | |||||||
| Image Set A | F = 9.462, p < 0.001 | 77.882 (2.506) | 66.039 (2.641) | 64.771 (1.855) | p = 0.004 | p < 0.001 | p = 0.918 |
| Image Set B | F = 6.349, p = 0.002 | 76.098 (2.512) | 67.291 (2.512) | 65.309 (1.751) | p = 0.038 | p = 0.002 | p = 0.794 |
| Image Set C | F = 4.493, p = 0.013 | 38.226 (2.276) | 32.685 (2.475) | 29.851 (1.620) | p = 0.229 | p = 0.009 | p = 0.604 |
| Image Set D | F = 0.618, p = 0.541 | 69.591 (2.383) | 68.144 (2.751) | 66.458 (1.619) | p = 0.917 | p = 0.523 | p = 0.858 |
| Image Set E | F = 1.932, p = 0.149 | 69.975 (2.312) | 69.355 (2.243) | 65.311 (1.531) | p = 0.980 | p = 0.216 | p = 0.299 |
| Image Set F | F = 10.853, p < 0.001 | 73.781 (2.370) | 64.308 (2.467) | 59.948 (1.795) | p = 0.017 | p < 0.001 | p = 0.329 |
| Image Set G | F = 3.140, p = 0.046 | 75.633 (3.112) | 68.582 (2.695) | 66.572 (1.849) | p = 0.204 | p = 0.035 | p = 0.812 |
| Image Set H | F = 7.612, p = 0.001 | 74.448 (2.408) | 64.188 (2.473) | 63.202 (1.773) | p = 0.010 | p = 0.001 | p = 0.944 |
| Image Set I | F = 4.045, p = 0.019 | 75.441 (2.403) | 68.187 (2.530) | 67.261 (1.710) | p = 0.098 | p = 0.017 | p = 0.951 |
| Image Set J | F = 3.481, p = 0.033 | 73.853 (2.888) | 67.934 (3.127) | 64.897 (1.795) | p = 0.348 | p = 0.025 | p = 0.677 |
| Image Set K | F = 11.007, p < 0.001 | 77.211 (2.371) | 68.641 (2.403) | 63.566 (1.688) | p = 0.032 | p < 0.001 | p = 0.198 |
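As an illustration of how the F values in the ANOVA column relate to group ratings, the following is a minimal sketch of a one-way ANOVA F statistic computed by hand. It is not the authors' analysis code: the function name `one_way_anova_f` and the three small rating lists are hypothetical, chosen only to make the arithmetic easy to follow. The post-hoc Tukey's HSD p-values additionally require the studentized range distribution, which is usually taken from a statistics package (for example `scipy.stats.tukey_hsd`) rather than computed by hand.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of ratings per drinking group.
    """
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: ratings around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical ratings (NOT data from the study) for three risk groups.
low_risk = [1.0, 2.0, 3.0]
mod_risk = [2.0, 3.0, 4.0]
high_risk = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([low_risk, mod_risk, high_risk])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A larger F means the group means differ more than would be expected from the spread of ratings within each group.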
All images were assigned general categorizations, as well as more fine-grained categorizations and brand information, all of which can be found in the Supplemental Materials. Each image was assigned as many categorizations as were appropriate, with the following counts per general category: beer (n = 563), wine (n = 425), white wine (n = 201), red wine (n = 231), liquor (n = 850), mixed drink/cocktail (n = 696), cider (n = 69), champagne (n = 119), and accessories/environment (n = 335).
Discussion
The present study established the Wake Forest Alcohol Image Set (WFAIS), a database of 1,650 alcohol-related images of a variety of sizes and dimensions; collected from members of the community; depicting a wide range of alcohol-related beverages, behaviors, and scenarios, captured in real-life environments; and containing assorted brand information. All images were assessed on scales of valence, arousal, and relation to alcohol by low-risk, moderate-risk, and high-risk alcohol consumers, with methods adapted from the procedure developed for the IAPS (Lang et al., 2015). Our results showed high interrater reliability across all three rating domains within the full sample of raters, as well as within each of the three drinking divisions. In addition to these reliability scores, overall average ratings on each scale for each image can be found in the Supplemental Materials. Overall, images were rated with valence around 35, arousal around 40, and relation to alcohol around 65 out of 100. Low-risk drinking raters generally reported feeling less happy and calmer than the moderate and high-risk drinkers while viewing these images (as lower valence and arousal scores indicate greater happiness and excitement). Low-risk drinking raters also generally rated the images as more highly related to alcohol than the moderate or high-risk drinkers did. These findings replicate significant differences observed by Billieux et al. (2011), where non-risky drinkers rated alcohol images as making them feel more in control and less aroused (scored as higher dominance and arousal) than their risky-drinking counterparts.
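For readers unfamiliar with the reliability statistic reported here, the following is a minimal sketch of Cronbach's alpha applied to interrater agreement, treating each rater as an "item" and each image as a case. The function name `cronbach_alpha` and the small rating matrices are hypothetical and are not taken from the study data.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a ratings matrix.

    `ratings` is a list of rows, one row per image; each row holds one
    score per rater. Uses sample variances (n - 1 denominator).
    """
    n_raters = len(ratings[0])
    # Variance of each rater's scores across images.
    rater_variances = [
        variance([row[j] for row in ratings]) for j in range(n_raters)
    ]
    # Variance of the summed score each image receives across raters.
    total_variance = variance([sum(row) for row in ratings])
    return n_raters / (n_raters - 1) * (1 - sum(rater_variances) / total_variance)


# Hypothetical example: three raters who order four images identically.
consistent = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
alpha_consistent = cronbach_alpha(consistent)

# Hypothetical example: three raters who mostly, but not fully, agree.
mostly_consistent = [[1, 2, 3], [2, 2, 4], [3, 4, 4], [4, 5, 6]]
alpha_mostly = cronbach_alpha(mostly_consistent)

print(round(alpha_consistent, 3), round(alpha_mostly, 3))
```

Values close to 1 indicate that raters rank the images similarly, which is the sense in which reliability is described as high.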
A major limitation of this study was the lack of demographic information collected from MTurk workers. Without this information, it is impossible to know the specific age, race, sex, or gender make-up of this sample of image raters. Based on the general demographic information known for all MTurk workers, as well as the AUDIT scores collected from all raters, we presume that these ratings are generalizable to the general population. We also presume that the high-risk drinkers’ ratings are generalizable to heavy alcohol consumers in general. Additionally, the lack of demographic information prevented the use of sex-specific AUDIT cut-offs; instead, overall cut-offs based on the traditional male cut-offs were used.
In conclusion, the WFAIS was created to provide alcohol researchers with a high volume of cue imagery validated using standard methods. The database contains enough images that researchers can select the content most relevant to their study or participant needs, based either on the images’ valence, arousal, or relation to alcohol ratings, or on the images’ content or categorization. This image set showed high interrater reliability across all participant divisions. Overall, the alcohol research community will benefit from this well-validated, extensive image database. The images are available for use and download through Google Photos at https://photos.app.goo.gl/LsJrAHFXhpSAokrn6.
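The selection process described above could be sketched as follows. This is an illustrative example only: the record layout (`ImageRating`), the field names, the image identifiers, and the rating values are all hypothetical, since the actual layout of the Supplemental Materials is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRating:
    """Hypothetical per-image record; field names are assumptions."""
    image_id: str
    valence: float           # 0-100, lower = happier
    arousal: float           # 0-100, lower = more excited
    alcohol_relation: float  # 0-100, higher = more alcohol-related
    categories: frozenset


def select_images(records, category, min_alcohol_relation=0.0,
                  max_valence=100.0, max_arousal=100.0):
    """Return records in `category` that satisfy the rating thresholds."""
    return [
        r for r in records
        if category in r.categories
        and r.alcohol_relation >= min_alcohol_relation
        and r.valence <= max_valence
        and r.arousal <= max_arousal
    ]


# Made-up records for illustration; not ratings from the image set.
records = [
    ImageRating("img_001", 30.0, 35.0, 80.0, frozenset({"beer"})),
    ImageRating("img_002", 55.0, 60.0, 40.0, frozenset({"beer", "accessories/environment"})),
    ImageRating("img_003", 25.0, 30.0, 90.0, frozenset({"wine", "red wine"})),
]

selected = select_images(records, "beer", min_alcohol_relation=65.0, max_valence=40.0)
selected_ids = [r.image_id for r in selected]
print(selected_ids)
```

Because each image can carry several categorizations, the category check uses set membership rather than equality.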
Supplementary Material
Acknowledgment:
This work was supported by the National Institute on Alcohol Abuse and Alcoholism of the National Institutes of Health under Award Number P50AA026117 and Award Number T32AA007565. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors have no conflicts of interest to disclose.
References
- Austin EW, Chen M & Grube JW 2006. How does alcohol advertising influence underage drinking? The role of desirability, identification, and skepticism. Journal of Adolescent Health, 38, 376–384.
- Babor TF, Higgins-Biddle JC, Saunders JB & Monteiro MG 2001. AUDIT: Alcohol Use Disorders Identification Test, Guidelines for Use in Primary Care. Second ed. World Health Organization, Department of Mental Health and Substance Dependence.
- Billieux J, Khazaal Y, Oliveira S, de Timary P, Edel Y, Zebouni F, Zullino D & Van der Linden M 2011. The Geneva Appetitive Alcohol Pictures (GAAP): Development and preliminary validation. European Addiction Research, 17, 225–230.
- Casler K, Bickel L & Hackett E 2013. Separate but equal? A comparison of participants and data gathered via Amazon’s MTurk, social media, and face-to-face behavioral testing. Computers in Human Behavior, 29, 156–216.
- Cohen JP 2008. Citizen science: Can volunteers do real research? BioScience, 58, 192–197.
- CSEA-NIMH 2002. The International Affective Picture System (photographic slides). The Center for Research in Psychophysiology, University of Florida, Gainesville.
- Difallah D, Filatova E & Ipeirotis P 2018. Demographics and dynamics of Mechanical Turk workers. Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining.
- Difallah DE, Catasta M, Demartini G, Ipeirotis PG & Cudre-Mauroux P 2015. The dynamics of micro-task crowdsourcing: The case of Amazon MTurk. Proceedings of the 12th International Conference on World Wide Web.
- Fey W, Moggi F, Rohde KB, Michel C, Seitz A & Stein M 2016. Development of stimulus material for research in alcohol use disorders. International Journal of Methods in Psychiatric Research, 26, e1527.
- Gache P, Michaud P, Landry U, Accietto C, Arfaoui S, Wegner O & Daeppen JB 2005. The Alcohol Use Disorders Identification Test (AUDIT) as a screening tool for excessive drinking in primary care: Reliability and validity of a French version. Alcohol Clin Exp Res, 29, 2001–2007.
- GBD-AC, G. B. o. D. A. C. 2018. Alcohol use and burden for 195 countries and territories 1990–2016: A systematic analysis for the Global Burden of Disease Study 2016. The Lancet, 392, P1015–1035.
- Grüsser SM, Heinz A & Flor H 2000. Standardized stimuli to assess drug craving and drug memory in addicts. Journal of Neural Transmission, 107, 715–720.
- Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N & Conde JG 2009. Research electronic data capture (REDCap) - A metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform.
- Kreusch F, Vilenne A & Quertemont E 2013. Response inhibition towards alcohol-related cues using an alcohol go/no-go task in problem and non-problem drinkers. Addictive Behaviors, 38, 2520–2528.
- Krueger KM 2015. Cue reactivity to images in alcohol: Creation of a standardized picture set. Master of Science, Syracuse University.
- Lang PJ, Bradley MM & Cuthbert BN 2015. The International Affective Picture System (IAPS) in the study of emotion and attention: Technical manual and affective ratings. The Center for Research in Psychophysiology, University of Florida, Gainesville.
- Lee E, Namkoong K, Lee CH, An SK & Lee BO 2006. Differences in photographs inducing craving between alcoholics and non-alcoholics. Yonsei Medical Journal, 47, 491–497.
- López-Caneda E & Carbia C 2018. The Galician Beverages Picture Set (GBPS): A standardized database of alcohol and non-alcohol images. Drug and Alcohol Dependence, 184, 42–47.
- Lovett DE, Ham LS & Veilleux JC 2015. Psychometric evaluation of a standard set of alcohol cue photographs to assess craving. Addictive Behaviors, 48, 58–61.
- Nees F, Diener C, Smolka MN & Flor H 2012. The role of context in the processing of alcohol-relevant cues. Addiction Biology, 17, 441–451.
- Pronk T, van Deursen DS, Beraha EM, Larsen H & Wiers RW 2015. Validation of the Amsterdam Beverage Picture Set: A controlled picture set for cognitive bias measurement and modification paradigms. Alcoholism: Clinical and Experimental Research, 39, 2047–2055.
- Silvertown J 2009. A new dawn for citizen science. Trends in Ecology & Evolution, 24, 467–471.
- Stauffer CS, Dobberteen L & Woolley JD 2017. American Alcohol Photo Stimuli (AAPS): A standardized set of alcohol and matched non-alcohol images. The American Journal of Drug and Alcohol Abuse, 43.