Journal of Studies on Alcohol and Drugs. 2012 Sep;73(5):834–838. doi: 10.15288/jsad.2012.73.834

Innovative Recruitment Using Online Networks: Lessons Learned From an Online Study of Alcohol and Other Drug Use Utilizing a Web-Based, Respondent-Driven Sampling (webRDS) Strategy

José A Bauermeister a,*, Marc A Zimmerman a, Michelle M Johns a, Pietreck Glowacki b, Sarah Stoddard a, Erik Volz a
PMCID: PMC3410951  PMID: 22846248

Abstract

Objective:

We used a web version of Respondent-Driven Sampling (webRDS) to recruit a sample of young adults (ages 18–24) and examined whether this strategy would result in alcohol and other drug (AOD) prevalence estimates comparable to national estimates (National Survey on Drug Use and Health [NSDUH]).

Method:

We recruited 22 initial participants (seeds) via Facebook to complete a web survey examining AOD risk correlates. Sequential, incentivized recruitment continued until our desired sample size was achieved. After correcting for webRDS clustering effects, we contrasted our AOD prevalence estimates (past 30 days) to NSDUH estimates by comparing the 95% confidence intervals of prevalence estimates.

Results:

We found comparable AOD prevalence estimates between our sample and NSDUH for the past 30 days for alcohol, marijuana, cocaine, Ecstasy (3,4-methylenedioxymethamphetamine, or MDMA), and hallucinogens. Cigarette use was lower than NSDUH estimates.

Conclusions:

WebRDS may be a suitable strategy to recruit young adults online. We discuss the unique strengths and challenges that may be encountered by public health researchers using webRDS methods.


Online interactions have created new venues to collect psychosocial and behavioral data (Strecher, 2007; Van Gelder et al., 2010). Internet use in the United States among young adults between the ages of 18 and 29 years was close to 95% in 2011 (Pew Internet and American Life Project, 2011). Online data collection is fast and cheap, reduces participant burden, and provides flexibility in measuring responses with complex skip patterns. Nonetheless, several methodological concerns may threaten the internal and external validity of web-based studies, including low response rates and inadequate generalizability (Pequegnat et al., 2007).

Researchers have sought to offset these limitations by developing online strategies that parallel face-to-face recruitment, including online snowball sampling and time-venue sampling of chat rooms. These strategies, however, are costly and labor intensive, depend on participants’ presence in specific sites at a given time, and may limit access to segments of the population. Respondent-Driven Sampling (RDS; Heckathorn, 1997) may improve sample recruitment and generalizability, particularly among hidden or stigmatized populations. RDS uses chain referrals with structured incentives as a recruitment strategy and statistically adjusts for chain-referral bias by weighting the data using participants’ social-network parameters (Salganik and Heckathorn, 2004). Researchers who have used RDS to recruit drug-using populations (Ramirez-Valles et al., 2005; Wang et al., 2007), for example, have reported greater diversity in their samples and have argued that their samples provide a closer representation of the population. On the other hand, Goel and Salganik (2010) noted that, when compared with simple random sampling, RDS may not be an optimal strategy for public health surveillance, given the larger variability of RDS estimates.

The applicability of RDS as an online sampling strategy remained untested until recently. Wejnert and Heckathorn (2008) found that an adapted, web-based version of RDS (webRDS) was feasible and effective among college students, with referral chains progressing up to 20 times faster than in traditional RDS. To date, however, the applicability of webRDS for public health research has not been studied. Given the existing limitations in sampling and characterizing young adults’ networks and the suitability of webRDS for recruiting young participants, we examined whether webRDS could be used to recruit young adults into a web survey designed to assess their alcohol and other drug (AOD) use. To examine the adequacy of webRDS for obtaining estimates that approximate population parameters, we contrasted our AOD estimates with those obtained through a random household sample by the Substance Abuse and Mental Health Services Administration's (2010) National Survey on Drug Use and Health (NSDUH; formerly the National Household Survey on Drug Abuse).

Method

Sample and recruitment

The data came from an observational study examining young adults’ relationships online (Virtual Networks Study). To be eligible, youth had to be 18–24 years old, live in the United States, and have access to the Internet. We used webRDS to recruit participants. The first wave of participants (i.e., seeds) was selected based on race/ethnicity and region of the United States to ensure that initial networks were diverse and that we would not concentrate recruitment in a single geographic region. Seeds were recruited through a targeted online Facebook advertisement. After completing the study's eligibility screener, the seeds provided their contact information. We called eligible participants and, if they filled a vacancy in our race/region matrix, provided them with a link and password to the web questionnaire. Phone screening allowed us to (a) verify that participants were not automated “bots” programmed by hackers to respond to online advertising and (b) answer any questions from the seed participants. We recruited 22 seeds, with a diverse racial and regional composition (5 Black/African American, 8 Latino[a]/Hispanic, 9 White/European American; 7 from the Northeast, 6 from the South, 4 from the West, and 5 from the Midwest). The remainder of our sample (n = 3,426) was recruited through seed referral.
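
To make the screening step concrete, the sketch below shows one way a race-by-region seed quota could be tracked during phone screening. It is a minimal illustration in Python: the quota values, cell labels, and function names are our own assumptions, as the study's actual matrix targets are not reported.

```python
from collections import Counter

# Illustrative seed quotas per (race/ethnicity, region) cell; the study's
# actual targets are not published, so these numbers are hypothetical.
QUOTAS = {
    ("Black/African American", "South"): 2,
    ("Latino(a)/Hispanic", "West"): 2,
    ("White/European American", "Midwest"): 3,
    # ... one entry per race-by-region cell ...
}
filled: Counter = Counter()

def fills_vacancy(race: str, region: str) -> bool:
    """During the phone screen, check whether this caller's race/region
    cell still has an open seed slot; enroll and count them if so."""
    cell = (race, region)
    if filled[cell] < QUOTAS.get(cell, 0):
        filled[cell] += 1
        return True
    return False
```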

Procedure

Study data were protected with 128-bit SSL encryption and kept on a secure, firewalled server at the University of Michigan. Each prospective participant logged into the survey portal using their unique identifying number, created an account using a personal email address, and filled out the eligibility screener. Eligible participants read and consented to the study and completed the survey assessing their sociodemographic characteristics, Internet use, lifetime and recent AOD use, lifetime and recent sexual behaviors, and other AOD correlates (e.g., mental health, peer and parent AOD use). Respondents took an average of 37 minutes to complete the questionnaire. Participants were also asked whether they would share their Facebook social-network data with the research team; a third agreed (n = 1,146).

Participants received a $20 incentive for completing the survey and were offered an additional $10 for each referred young adult who completed it. Incentives were paid with a VISA e-gift card, which was reloaded as referrals completed the survey. Our first lesson came from the referral process itself. Initially, seeds were asked to enter the email addresses of two friends at the end of the survey, and we sent these referrals a computer-generated email containing a link to the survey. As recommended (Wejnert and Heckathorn, 2008), invitations were sent 24 hours after survey completion to stabilize recruitment and give participants time to let peers know that they would be receiving email from the study team. Only two referrals enrolled in the study, raising concerns that the automated messages were being caught in spam filters or deleted as junk. Consequently, we sent each seed a referral email that they could forward to friends who might be interested, expecting that referrals would be less likely to delete a message from someone familiar and more likely to use the link provided. Invitation emails included a unique identifying number for the referral and also helped to (a) reduce threats to a potential young adult's confidentiality and privacy and (b) reduce concerns that referral chains were being broken by filtering of our email invitations. Nevertheless, these corrections did not increase referrals.
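
As one illustration of this step, a forwardable invitation might be generated as below. This is a hedged sketch, not the study's actual system: the message wording, SURVEY_URL, and function name are all hypothetical.

```python
# Hypothetical portal address; the study's actual URL is not published.
SURVEY_URL = "https://example.edu/vns"

def invitation_email(referrer_name: str, unique_id: str) -> str:
    """Compose a referral email a participant can forward to friends.

    Embedding the unique identifying number lets completed surveys be
    credited back to the referrer for the $10-per-referral incentive."""
    return (
        f"Hi! Your friend {referrer_name} thought you might be interested "
        f"in a paid University of Michigan web survey for adults ages 18-24.\n"
        f"Start here: {SURVEY_URL}?ref={unique_id}\n"
    )
```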

In a final attempt to increase referrals, we telephoned seeds and asked about their experiences with the referral emails. Overall, seeds mentioned that they had never forwarded the email and had not told their referrals that they had invited them to participate. When asked how to improve our webRDS strategy, they suggested that being able to refer more friends, and thereby earn a larger total incentive, would increase their motivation to refer peers. Consequently, we increased the number of paid referrals to five (i.e., up to $50 for referring friends) and allowed participants to copy and paste the link containing their unique identifying number into instant messages, text messages, and/or social-network sites (e.g., Facebook). This revised procedure immediately re-energized recruitment, spurring vertical growth of our referral chains.

Because multiple individuals could begin the survey at the same time using the same unique identifying number, some chains unexpectedly grew to as many as 30 people in the second generation. To keep the chains growing vertically rather than reaching our target sample size through rapid horizontal growth, each unique identifying number could be used only up to 10 times. If more than five referrals completed the survey, we allowed the first five completers to refer their peers; the remaining participants were thanked and compensated for completing the questionnaire but were not asked to refer friends.
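
Expressed as code, these chain-control rules reduce to two checks. A minimal Python sketch, with constant and function names of our own choosing:

```python
MAX_STARTS_PER_ID = 10  # each unique identifying number usable up to 10 times
MAX_REFERRERS = 5       # only the first five completers per ID may refer

def can_start_survey(starts_so_far: int) -> bool:
    """Admit a new respondent only while the referral ID has starts left."""
    return starts_so_far < MAX_STARTS_PER_ID

def may_refer_peers(completion_rank: int) -> bool:
    """The first five completers under an ID are asked to refer peers;
    later completers are compensated but receive no referral links."""
    return completion_rank <= MAX_REFERRERS
```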

Survey data were checked daily to screen out duplicate and fraudulent cases (n = 675; 16% of all completed entries received) in an effort to preserve data quality (Bauermeister et al., in press). Duplicate and fraudulent cases were not allowed to refer others into the study. From our first completed referral until the last participant, data collection took 2.5 months.
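
A daily screening pass might look like the following. Because the actual de-duplication and fraud-detection criteria are detailed in Bauermeister et al. (in press) rather than here, this Python sketch substitutes a single illustrative check (email reuse) and should not be read as the study's method.

```python
# Illustrative duplicate screen only; field names are hypothetical.
def flag_duplicates(entries: list[dict]) -> set[str]:
    """Flag later entries that reuse an email address already seen."""
    seen: set[str] = set()
    flagged: set[str] = set()
    for e in entries:  # entries assumed ordered by submission time
        email = e["email"].strip().lower()
        if email in seen:
            flagged.add(e["entry_id"])
        else:
            seen.add(email)
    return flagged
```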

Measures

Recruitment network characteristics.

Following prior recommendations (Wejnert and Heckathorn, 2008), young adults were asked to indicate who had referred them to the study. Participants were also asked a series of staggered questions regarding the number of young adults whom they knew, including the number of young adults who lived in their area, followed by how many of these they could contact and had interacted with in the past 3 months. We asked participants to estimate, among those with whom they had interacted in the past 3 months, how many they had communicated with online as well as the racial/ethnic composition of these social-network contacts. Finally, we asked participants to estimate how many of their friends were of their same race/ethnicity. We used these data to develop the statistical weight (RDS2) needed to account for the intraclass correlation resulting from the network-referral procedures in our analyses (Volz and Heckathorn, 2008). Gile (2011) compared several competing sample weights and found that our RDS2 method may be a stable and accurate estimator.
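
The intuition behind the RDS2 weight can be made concrete with a short sketch. Under the Volz and Heckathorn (2008) estimator, each respondent counts inversely to their reported network degree, offsetting the tendency of chain referral to oversample well-connected people. The Python below is a minimal illustration of that estimator, not the study's analysis code:

```python
# RDS2 (Volz and Heckathorn, 2008): weight each case by the inverse of
# its self-reported network degree.
def rds2_prevalence(outcomes: list[int], degrees: list[float]) -> float:
    """Degree-weighted prevalence: `outcomes` are 0/1 use indicators and
    `degrees` the self-reported network sizes (must be positive)."""
    inv_degrees = [1.0 / d for d in degrees]
    return sum(y * w for y, w in zip(outcomes, inv_degrees)) / sum(inv_degrees)

# e.g., rds2_prevalence([1, 0, 1], [50, 10, 20]) ~= 0.41, versus 0.67
# unweighted: the low-degree nonuser counts for more than either user.
```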

Alcohol and other drug use.

Using wording similar to that of Monitoring the Future (Johnston et al., 2011), participants indicated their lifetime substance use in the survey. Substances included alcohol, cigarettes, marijuana, cocaine, Ecstasy (3,4-methylenedioxymethamphetamine, or MDMA), hallucinogens, and nonprescription drugs, among others. For each substance endorsed for lifetime use, participants then indicated their frequency of use in the past 30 days (never, once a month or less, two to three times a month, about once a week, two to six times a week, about once a day, more than once a day). Participants who had never used a substance were coded as “never.”

Sociodemographic characteristics.

Participants reported their sex, race/ethnicity, and education. We calculated participants’ age by subtracting their month and year of birth from the date of study participation. Participants also indicated their zip code and state, which we collapsed into census regions.
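
Because only month and year of birth were collected, the age computation is exact to within the unknown day of the month. A one-function Python sketch of that calculation (names are ours):

```python
from datetime import date

def age_in_years(birth_year: int, birth_month: int, on: date) -> int:
    """Years elapsed since the birth month, ignoring day of month."""
    return (on.year - birth_year) - (1 if on.month < birth_month else 0)

# e.g., age_in_years(1990, 6, date(2011, 3, 15)) -> 20
```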

Data analytic strategy

We computed the RDS2 statistical weight to correct for the intraclass correlation resulting from the network-referral procedures. We then examined whether webRDS approximated national prevalence estimates of substance use in lifetime and past 30 days, computing prevalence percentages and 95% confidence intervals (CIs) using SPSS Version 18 (SPSS Inc., Chicago, IL).
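
For readers who want the mechanics, the sketch below computes a weighted prevalence and a Wald-style 95% CI using the Kish effective sample size. This is a simplified stand-in: the published analysis was run in SPSS 18 with the RDS2 weight, and SPSS's variance computation for weighted data may differ.

```python
import math

def weighted_prevalence_ci(outcomes: list[int], weights: list[float],
                           z: float = 1.96) -> tuple[float, tuple[float, float]]:
    """Return the weighted prevalence and an approximate 95% CI."""
    total_w = sum(weights)
    p = sum(y * w for y, w in zip(outcomes, weights)) / total_w
    n_eff = total_w ** 2 / sum(w * w for w in weights)  # Kish effective n
    se = math.sqrt(p * (1.0 - p) / n_eff)
    return p, (p - z * se, p + z * se)
```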

Results

Sample description

As shown in Table 1, the final sample (n = 3,448) was evenly split by sex, averaged 20.8 years (SD = 1.76), and ranged across the four regions of the United States (47 states were represented). The final sample was 70.4% White/European American, 4.9% Black/African American, 11.8% Asian/Pacific Islander, 8.4% Hispanic/Latino(a), 0.8% Native American/Alaska Native, and 3.7% other. The majority of participants had some college education.

Table 1.

Sample description for unweighted and weighted data

Variable                           Virtual network,   Virtual network,     2010 U.S. Census,
                                   unweighted         with RDS2 weight     ages 18–24
                                   (n = 3,448) %      (n = 829) %          %
Sex
 Male                              51.6               50.9                 51.1
 Female                            48.4               49.1                 48.9
Race/ethnicity
 White                             70.4               69.7                 72.4
 Black                              4.9                5.0                 12.6
 Hispanic/Latino(a)                 8.4                8.9                 16.3
 Asian/Pacific Islander            11.8               11.6                  4.8
 Native American/Alaska Native      0.8                1.0                  0.9
 Other                              3.7                3.6                  6.2
Educational attainment
 Less than 9th grade                0.1                0.1                  1.9
 9th to 12th grade, no diploma      3.7                4.2                 17.7
 High school graduate              23.1               23.8                 29.7
 Technical/associate degree         3.9                4.4                  5.1
 Some college, no degree           49.8               48.4                 36.5
 Bachelor's degree                 18.2               18.0                  8.5
 Graduate or professional degree    1.1                1.0                  0.7
Region
 Northeast                         36.3               35.9                 17.7
 Midwest                           23.4               23.1                 21.4
 South                             28.5               29.4                 37.0
 West                              11.8               11.5                 23.9

Note: RDS = Respondent-Driven Sampling.

Alcohol and other drug prevalence estimates

Our sample reported past-30-day AOD use comparable to that found among young adults in the NSDUH. After RDS design effects were adjusted for, more than half of the sample reported alcohol use in the past month (62.4%, CI [59.1, 65.7]), closely approximating the NSDUH estimate (61.8%, CI [61.1, 62.5]). Slightly more than one fifth of young adults reported using marijuana in the past month (21.4%, CI [18.6, 24.2]), similar to NSDUH estimates (18.1%, CI [17.6, 18.6]). Cocaine use over the past month was comparable between our sample (1.2%, CI [0.05, 2.3]) and the NSDUH estimate (1.4%, CI [1.2, 1.6]). We also found comparable estimates for Ecstasy use between our sample (1.5%, CI [0.7, 2.3]) and NSDUH participants (1.1%, CI [1.0, 1.2]) and for hallucinogen use in our sample (2.6%, CI [1.5, 3.7]) compared with the NSDUH (1.8%, CI [1.6, 2.0]). Nonprescription drug use (4.5%, CI [3.1, 5.9]) approximated that reported in the NSDUH (6.3%, CI [6.0, 6.6]). We found lower rates of cigarette use (19.9%, CI [17.1, 22.6]) than the NSDUH (35.8%, CI [35.1, 36.5]).
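
A small helper makes the comparison rule explicit: two prevalence estimates were treated as comparable when their 95% confidence intervals overlap. The function name is ours; the interval values below are the paper's own.

```python
def cis_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if intervals a and b share any common value."""
    return a[0] <= b[1] and b[0] <= a[1]

print(cis_overlap((59.1, 65.7), (61.1, 62.5)))  # alcohol: True (comparable)
print(cis_overlap((17.1, 22.6), (35.1, 36.5)))  # cigarettes: False (lower use)
```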

Discussion

The increased popularity of online interactions as a form of day-to-day communication has created opportunities to collect data using web-based surveys. Although RDS was originally designed to recruit hard-to-reach populations without a population sampling frame through face-to-face peer exchanges, researchers suggest that webRDS may be used to recruit representative samples online (Wejnert and Heckathorn, 2008). We examined whether webRDS facilitated the recruitment of a representative sample of young adults in the United States and provided substance use estimates comparable to those offered by the NSDUH. We found comparable estimates for alcohol, marijuana, cocaine, Ecstasy, hallucinogens, and nonprescription drugs.

Our sample, however, reported less cigarette use than the NSDUH sample. This difference may be attributable to our participants’ educational attainment: rates of cigarette use among youth who do not complete their secondary education are higher than among those who complete high school (Substance Abuse and Mental Health Services Administration, 2010). Whereas the NSDUH sampling frame is nationally representative and therefore includes individuals with lower educational attainment, most of our sample had completed high school. Consistent with this explanation, our cigarette use estimate (19.9%) was comparable to that of a national secondary school sample (19.2% in Monitoring the Future; Johnston et al., 2011).

Compared with traditional RDS and other recruitment modes (e.g., face to face), webRDS overcame physical and temporal barriers, which helped expedite recruitment. Rather than having to refer a peer in person, for example, young adults could refer via a wall post, status update, or message to their online social network. Furthermore, the asynchronous nature of online communication was an added recruitment advantage over traditional approaches because participants could invite peers even when those individuals were not online at the same time. Although the ability to recruit peers through passive (e.g., mass advertisements) or active (e.g., personalized messages) strategies may encourage participants to invite peers in whatever way is most comfortable to them, we were unable to test whether participants differed based on how they had learned about the study. Future research examining whether passive and active referral strategies influence recruitment may be warranted. Based on our qualitative interviews with seed participants early in the study, however, offering multiple referral options appears to be most effective.

It is vital to acknowledge the persistent racial/ethnic and socioeconomic disparities in computer access and frequency of use. The Pew Internet and American Life Project (2011), for example, suggests that most youth in the United States are able to access the Internet daily. Nevertheless, large racial/ethnic disparities exist regarding where, when, and for how long people can stay online. Non-White minority individuals may not have home access to the Internet (Smith, 2010), which may limit when and for how long they sign in. As a result, it was not surprising that we had fewer participants at lower education levels or who self-identified as Hispanic/Latino(a) and/or African American/Black. Given these disparities, it is plausible that technologically disadvantaged young adults accessed email and social-networking applications less often than those with consistent access and, as a result, were less likely to follow up on peers’ invitations to participate in the study. Researchers seeking to use webRDS may need to consider carefully how to overcome these disparities by creating mechanisms that allow for the sampling of subgroups that use the Internet less frequently than majority groups. Researchers, for example, may consider oversampling technologically disadvantaged individuals as seeds and/or allocating special coupons that take longer to expire if referrers are technologically disadvantaged. Further, Pew data suggest that the digital divide is smaller for smart phones (Smith, 2010); therefore, expanding webRDS and web surveys to smart phone platforms may be warranted.

WebRDS requires attention to several factors not present in traditional survey methods. The referral system creates exponential recruitment: whereas recruitment in other sampling designs may proceed at a steady rate over time, researchers need to prepare for exponential growth in their recruitment timetable. Staff efforts need to be allocated accordingly, particularly if the data are being verified for duplicate or falsified entries.
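
As a back-of-the-envelope illustration (our numbers, not the study's observed chain data), with s seeds and an average branching factor of r completed referrals per participant, the expected cumulative sample through generation G is geometric:

```latex
% Expected cumulative recruits through generation G, with s seeds and an
% average of r completed referrals per participant (r > 1):
N(G) = \sum_{g=0}^{G} s\,r^{g} = s\,\frac{r^{G+1} - 1}{r - 1}
```

For instance, with s = 22 seeds and a hypothetical r = 2, N(7) = 22(2^8 − 1) = 5,610, so a target near 3,400 would be passed within about seven referral generations; staffing for verification must scale with this curve rather than with calendar time.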

Finally, researchers using webRDS need to consider the budget implications of this sampling plan. When we accounted for the network-based referral design, we found a reduction in effective sample size similar to that found in prior work by Salganik (2006). This reduction may affect the proposed statistical power and quickly increase study costs. Costs averaged $25.98 per participant in the total sample (n = 3,448) but rose to $108 per participant after the RDS2 adjustment (n = 829). Note that this estimate reflects our revised incentive structure, which was informed by feedback from study seeds. Costs per participant could decrease if respondents are comfortable receiving a smaller incentive for completing the survey and/or referring peers. Future research examining how varying incentive amounts affect participants’ motivation to refer peers is warranted.
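
The arithmetic behind the two per-participant figures is simply the same total spread over the raw versus the design-effect-adjusted sample size:

```latex
3{,}448 \times \$25.98 \approx \$89{,}579, \qquad
\frac{\$89{,}579}{829} \approx \$108 \text{ per effective participant}
```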

Web- and mobile phone–based sampling strategies will be necessary for survey research in the 21st century. As traditional survey methods become increasingly difficult (e.g., fewer landlines) and online communications become increasingly ubiquitous, researchers will need to develop strategies that achieve representative samples using online methods. Taken together, our findings suggest that webRDS may be a suitable and timely online recruitment strategy for young adults. Several lessons learned emerge from our findings:

  • (a) provide enough incentives to encourage referral motivation;

  • (b) implement procedures to ensure vertical—rather than horizontal—growth in referral networks;

  • (c) collect network data to improve weighting algorithms (e.g., RDS2);

  • (d) oversample seeds from subgroups that may be underrepresented to promote their inclusion in the study; and

  • (e) consider participants’ access to web-accessible computers and perhaps develop smart phone applications to increase accessibility.

We encourage other scholars to replicate our approach as well as examine the adequacy of webRDS in reaching more specialized populations. Finally, research comparing webRDS with other sampling modes and other study variables (e.g., social determinants of health) is warranted.

Footnotes

This research was supported by Research Challenge Grant 5RC-1DA028061-02 from the National Institute on Drug Abuse to Marc A. Zimmerman, principal investigator. José A. Bauermeister is supported by Career Development Award 1K01MH087242 from the National Institute of Mental Health.

References

  1. Bauermeister JA, Pingel E, Zimmerman MA, Couper MP, Carballo-Diéguez A, Strecher VJ. Data quality in web-based HIV/AIDS research: Handling invalid and suspicious data. Field Methods. In press. doi: 10.1177/1525822X12443097.
  2. Gile KJ. Improved inference for respondent-driven sampling data with application to HIV prevalence estimation. Journal of the American Statistical Association. 2011;106:135–146.
  3. Goel S, Salganik MJ. Assessing respondent-driven sampling. Proceedings of the National Academy of Sciences of the United States of America. 2010;107:6743–6747. doi: 10.1073/pnas.1000261107.
  4. Heckathorn DD. Respondent-driven sampling: A new approach to the study of hidden populations. Social Problems. 1997;44:174–199.
  5. Johnston LD, O'Malley PM, Bachman JG, Schulenberg JE. Monitoring the Future national survey results on drug use, 1975–2010. Volume I: Secondary school students. Ann Arbor, MI: Institute for Social Research, The University of Michigan; 2011.
  6. Pequegnat W, Rosser BRS, Bowen AM, Bull SS, DiClemente RJ, Bockting WO, Zimmerman R. Conducting Internet-based HIV/STD prevention survey research: Considerations in design and evaluation. AIDS and Behavior. 2007;11:505–521. doi: 10.1007/s10461-006-9172-9.
  7. Pew Internet and American Life Project. Demographics of Internet users. 2011. Retrieved from http://www.pewinternet.org/Static-Pages/Trend-Data-(Adults)/Whos-Online.aspx
  8. Ramirez-Valles J, Heckathorn DD, Vázquez R, Diaz RM, Campbell RT. From networks to populations: The development and application of respondent-driven sampling among IDUs and Latino gay men. AIDS and Behavior. 2005;9:387–402. doi: 10.1007/s10461-005-9012-3.
  9. Salganik MJ. Variance estimation, design effects, and sample size calculations for respondent-driven sampling. Journal of Urban Health. 2006;83(Suppl. 1):98–112. doi: 10.1007/s11524-006-9106-x.
  10. Salganik MJ, Heckathorn DD. Sampling and estimation in hidden populations using respondent-driven sampling. Sociological Methodology. 2004;34:193–240.
  11. Smith A. Technology trends among people of color. Pew Internet and American Life Project. 2010. Retrieved from http://www.pewinternet.org/Commentary/2010/September/Technology-Trends-Among-People-of-Color.aspx
  12. Strecher V. Internet methods for delivering behavioral and health-related interventions (eHealth). Annual Review of Clinical Psychology. 2007;3:53–76. doi: 10.1146/annurev.clinpsy.3.022806.091428.
  13. Substance Abuse and Mental Health Services Administration. Results from the 2009 National Survey on Drug Use and Health: Volume I. Summary of national findings. Rockville, MD: Office of Applied Studies; 2010.
  14. van Gelder MMHJ, Bretveld RW, Roeleveld N. Web-based questionnaires: The future in epidemiology? American Journal of Epidemiology. 2010;172:1292–1298. doi: 10.1093/aje/kwq291.
  15. Volz E, Heckathorn DD. Probability based estimation theory for respondent driven sampling. Journal of Official Statistics. 2008;24:79–97. Retrieved from http://www.jos.nu/Articles/abstract.asp?article=241079
  16. Wang J, Falck RS, Li L, Rahman A, Carlson RG. Respondent-driven sampling in the recruitment of illicit stimulant drug users in a rural setting: Findings and technical issues. Addictive Behaviors. 2007;32:924–937. doi: 10.1016/j.addbeh.2006.06.031.
  17. Wejnert C, Heckathorn DD. Web-based network sampling: Efficiency and efficacy of respondent-driven sampling for online research. Sociological Methods & Research. 2008;37:105–134.
