Abstract
Use of internet panels to collect survey data is increasing because it is cost-effective, enables quick access to large and diverse samples, returns data for analysis faster than traditional methods, and standardizes the data collection process, making studies easy to replicate. A variety of probability-based panels have been created, including the Telepanel/CentERpanel, Knowledge Networks (now GFK KnowledgePanel®), the American Life Panel, the LISS Panel, and the Understanding America Study panel. Despite the advantage of having a known denominator (sampling frame), probability-based internet panels often have low recruitment participation rates, and some have argued that there is little practical difference between opting out of a probability sample and opting into a non-probability (convenience) internet panel. This paper provides an overview of both probability-based and convenience panels, discusses potential benefits and cautions for each method, and summarizes approaches used to weight panel respondents to better represent the underlying population. Challenges in using internet panel data are discussed, including false answers, careless responses, straight-lining (giving the same answer repeatedly), multiple surveys completed by the same respondent, and panelists who belong to multiple panels. More remains to be learned about internet panels and web-based data collection in general, and there are opportunities to evaluate data collected using mobile devices and social media platforms.
Keywords: internet panels, survey research
Baker et al. (2010) referred to the growth of panels for data collection as “one of the most compelling stories of the last decade” (p. 715). Use of internet panels to collect survey data is increasing because it is cost-effective, enables quick access to large and diverse samples, returns data for analysis faster than traditional methods, and standardizes the data collection process, making studies easy to replicate. Internet panels for research can be traced back to Willem Saris (http://en.wikipedia.org/wiki/Willem_Saris), a sociology professor at the University of Amsterdam.
Probability-based Internet Panels
When the first modems came on the market in 1985, Saris realized that one could create a computer-assisted data collection system without interviewers. Saris and De Pijper (1986) developed a working system for this purpose, the so-called Telepanel. At that time, a random sample of the population was provided with home computers and modems, and with a telephone connection if the household did not already have one. The experiences with this first system for interviewing without interviewers, which predated the web, are summarized by Saris (1998). The system was bought by the Dutch Gallup organization, which developed the first nationwide computer-based data collection panel, operational in 1986. In 1991, with support from the Dutch Science Foundation (NWO), the University of Amsterdam started a larger panel of about 3,000 individuals. After five years, this panel was taken over by the Tilburg University Center for Economic Research. That panel, the CentERpanel, is the oldest academic probability-based internet panel in the world.
A variety of other probability-based internet panels were created following the success of the panel initiated by Saris. In 1999, Knowledge Networks (now GFK KnowledgePanel®) created a panel of about 55,000 individuals using address-based sampling for recruitment: http://www.gfk.com/us/Solutions/consumer-panels/Pages/GfK-KnowedlgePanel.aspx. Advances in methods to improve representation of cell phone numbers when conducting random digit dialing have appeared (Hu, Pierannunzi, & Balluz, 2011; Voigt, Schwartz, Doody, Lee, & Li, 2011), but the GFK KnowledgePanel® is recruited using address-based sampling from the U.S. Postal Service’s Computerized Delivery Sequence File, described as “essentially a complete list of all US residential addresses, including those that are cell phone only and often missed in RDD sampling” (http://www.gfk.com/us/Solutions/consumer-panels/Pages/GfK-KnowledgePanel.aspx?gclid=CML0_t2Qy8ICFUmUfgodUIsAHg).
The Longitudinal Internet Studies for the Social Sciences (LISS) panel in the Netherlands used population registry-based sampling and recruited face-to-face and by telephone to obtain a panel of about 7,500 individuals in 2007: http://www.lissdata.nl/lissdata/. The year before, the American Life Panel had been developed in the U.S.; this panel includes approximately 6,000 adults recruited by random-digit dialing, face-to-face, and address-based sampling: http://mmicdata.rand.org/alp/. More recently, the Understanding America Study panel of about 2,000 individuals was created using address-based sampling: http://cesr.usc.ed/?page=UAS.
Despite the advantage of having a known denominator (sampling frame), probability-based internet panels tend to have low recruitment participation rates. About 6–7% of the targeted Knowledge Networks panel respondents were excluded because they were not in the service area of a WebTV internet service provider. Addresses were obtained for about 60% of the sampled telephone numbers. About 89% of those in the eligible random-digit dial sample were contacted for initial telephone interviews, 56% agreed to participate in the initial telephone interview and join the panel, and 72% of those installed the required WebTV device in their homes (Chang & Krosnick, 2009). Similarly, about 14% of eligible KnowledgePanel® households state a willingness to become a panel member and about three quarters of those follow through (http://www.knowledgenetworks.com/accuracy/spring2010/disogra-spring10.html), for a net sign-up rate of about 10%. The Understanding America Study has achieved a sign-up rate of 15–20% (recruitment is still ongoing).
The highest recruitment participation rate, 48%, has been achieved by the LISS panel in the Netherlands. The recruiting procedure draws households from the national population registry and then follows a sequence of initial invitation letters with a prepaid incentive, follow-up phone calls, and face-to-face recruiting (Scherpenzeel & Das, 2011). The process and incentive levels were based on an initial experiment aimed at maximizing the recruitment participation rate (Scherpenzeel & Toepoel, 2012).
Non-probability-based (convenience) Internet Panels
Some argue that there is little practical difference between opting out of a probability sample and opting into a non-probability sample (Rivers, 2013). Indeed, there is a plethora of internet panel vendors that rely on non-probability-based recruitment, and many researchers use those panels. For example, the NIH Toolbox project developed a multidimensional set of brief measures assessing cognitive, emotional, motor, and sensory function for ages 3 to 85. The study participants were part of the Delve, Inc. panel, assembled using online self-enrollment, enrollment through events hosted by Delve, and telephone calls from market research representatives (Gershon et al., 2013).
The composition of these non-probability (convenience) internet panels is known to differ from the underlying population. It is estimated that up to one-third of the U.S. adult population does not use the internet on a regular basis (Baker et al., 2010). Panel members tend to be more educated and have higher socioeconomic status than non-panel members (Craig et al., 2013). Response rates for members of convenience panels also tend to be low; Baker et al. (2010) suggested that they are often 10% or lower. As a result, many users of convenience panels adopt a quota sampling approach, targeting respondents with particular demographic and other characteristics, and then apply post-stratification adjustments (weights) to compensate for non-coverage and non-response. Panel respondents are weighted to match a target marginal distribution (e.g., the U.S. Census).
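As a simple illustration of such an adjustment (the notation is ours, not from the cited studies), with a single stratification variable the weight assigned to each respondent in cell $g$ is the ratio of the target proportion to the observed sample proportion:

$$w_g = \frac{N_g / N}{n_g / n},$$

where $N_g/N$ is the population (e.g., Census) proportion and $n_g/n$ is the unweighted sample proportion in cell $g$. For example, if a group makes up 11% of the target population but only 10% of respondents, each respondent in that group receives a weight of 0.11/0.10 = 1.1. Matching several margins simultaneously requires the iterative raking procedure discussed below.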
An analysis by Schonlau, van Soest, Kapteyn, and Couper (2009) of 11,279 individuals 55 and older in the 2002 Health and Retirement Study showed that the 30% reporting internet access differed substantially from those without internet access (see Table 1). Propensity score weights were created by predicting internet access from race/ethnicity, gender, education, age, marital status, income, home ownership, and self-rated health. The weighted estimates were more similar to the underlying population, but non-trivial differences remained. Yeager et al. (2011) concluded that probability sample surveys were consistently more accurate than non-probability sample surveys, even after post-stratification weighting of the data.
Table 1.
| | Full Sample | Internet Access | Weighted Internet Access |
|---|---|---|---|
| High blood pressure | 55% | 44% | 52% |
| Depressed | 19% | 11% | 15% |
| Difficulty dressing | 9% | 4% | 7% |
| Difficulty walking several blocks | 31% | 15% | 27% |
Note: Data are from Schonlau et al. (2009).
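A propensity-score weighting approach of this kind can be sketched in a few lines. The following is a minimal illustration in Python, not the actual Schonlau et al. (2009) code; the data file, variable names, and the use of simple inverse-propensity weights are assumptions made for the sake of the example.

```python
# Minimal sketch of propensity-score weighting in the spirit of Schonlau et
# al. (2009): predict internet access from respondent characteristics in the
# full reference sample, then weight the internet-access subsample by the
# inverse of the predicted propensity. File and column names are illustrative.
import pandas as pd
from sklearn.linear_model import LogisticRegression

hrs = pd.read_csv("reference_sample.csv")   # hypothetical reference survey file
covariates = ["age", "female", "years_education", "income",
              "married", "owns_house", "self_rated_health"]
# (Categorical predictors such as race/ethnicity would be dummy-coded first.)

model = LogisticRegression(max_iter=1000)
model.fit(hrs[covariates], hrs["internet_access"])
hrs["p_internet"] = model.predict_proba(hrs[covariates])[:, 1]

# Weight web (internet-access) respondents by 1 / propensity so that their
# weighted characteristics resemble the full reference sample.
web = hrs[hrs["internet_access"] == 1].copy()
web["weight"] = 1.0 / web["p_internet"]
web["weight"] *= len(web) / web["weight"].sum()   # normalize weights to sum to n

# Weighted prevalence of an outcome, e.g., high blood pressure.
weighted_pct = (web["high_blood_pressure"] * web["weight"]).sum() / web["weight"].sum()
print(f"Weighted prevalence of high blood pressure: {100 * weighted_pct:.1f}%")
```

In practice, respondents with very small estimated propensities produce very large weights, so weights are often trimmed or the propensity scores are grouped into strata (e.g., quintiles) before weighting.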
A recent study comparing responses to the PROMIS global health items across four surveys found comparable estimates of physical and mental health despite differences in survey sampling (probability vs. non-probability), although the National Health Interview Survey yielded more positive estimates of health due to the interview mode of data collection (Riley, Hays, Kaplan, & Cella, 2014). Chang and Krosnick (2009) found that non-probability internet data collection yielded the most accurate self-reports from the most biased sample, but the probability internet sample displayed the best combination of sample composition and self-report accuracy.
Approaches to Weighting Convenience Internet Panels
An example of a successful use of a convenience panel to represent the “general population” is the initial Patient-Reported Outcomes Measurement Information System (PROMIS®) data collection and weighting described by Liu et al. (2010). The study team set target quotas for the Polimetrix (now YouGov) convenience panel of over 1 million members: 50% female, 20% from each of five age groups (18–29, 30–44, 45–59, 60–74, and 75 and older), 12% African-American, 12% Hispanic, and 10% with less than a high school education. The demographics of the resulting respondents versus the 2000 Census are shown in Table 2. The PROMIS sample had a greater percentage of females and was much more educated and a little older than the U.S. general population (2000 Census). Post-stratification adjustment (analytic weights) was used to compensate for non-response and unequal selection probability. The PROMIS sample was weighted to have the same distribution on six demographic variables (gender, age, race/ethnicity, education, marital status, and income) using an iterative proportional fitting, or raking, method. Raking matches cell counts to the marginal distributions through cell-by-cell adjustments repeated until the weighted sample converges to the U.S. Census distributions.
Table 2.
| | 2000 Census | PROMIS Sample | Weighted PROMIS Sample |
|---|---|---|---|
| % Female | 52 | 55 | 52 |
| % Hispanic | 11 | 13 | 11 |
| % African-American | 11 | 10 | 11 |
| % Less than high school | 20 | 3 | 20 |
| % High school/GED graduates | 29 | 19 | 29 |
| % Greater than high school | 51 | 78 | 51 |
| Mean age | 45 | 59 | 45 |
Table 2 shows that the weighted PROMIS sample was similar to the U.S. 2000 Census on these demographic characteristics. The mean score on self-rated general health (“In general, how would you rate your health?” 5 = excellent, 4 = very good, 3 = good, 2 = fair, 1 = poor) for the weighted PROMIS sample was 3.42, compared to 3.56, 3.50, and 3.52 for the 2004 Medical Expenditure Panel Survey, the 2001–2002 National Health and Nutrition Examination Survey, and the 2005 Behavioral Risk Factor Surveillance System, respectively.
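The raking procedure can be sketched as follows. This is a minimal Python illustration, not the PROMIS weighting code; the input file and column names are assumptions, only three of the six margins are used, and the target proportions are taken (or derived) from Table 2. A full implementation would also rake on age, marital status, and income, and would typically trim extreme weights.

```python
# Minimal raking (iterative proportional fitting) sketch in the spirit of
# Liu et al. (2010). File and column names are illustrative; target margins
# come from Table 2 (remaining race/ethnicity categories grouped as "other").
import pandas as pd

df = pd.read_csv("panel_respondents.csv")    # hypothetical respondent file

targets = {
    "gender":   {"female": 0.52, "male": 0.48},
    "race_eth": {"hispanic": 0.11, "african_american": 0.11, "other": 0.78},
    "educ":     {"lt_high_school": 0.20, "high_school_ged": 0.29, "gt_high_school": 0.51},
}

df["weight"] = 1.0
for _ in range(100):                          # iterate until convergence
    max_change = 0.0
    for var, margin in targets.items():
        level_totals = df.groupby(var)["weight"].sum()
        grand_total = level_totals.sum()
        for level, target_prop in margin.items():
            # Scale weights in this cell so the weighted margin matches the target.
            factor = target_prop / (level_totals[level] / grand_total)
            df.loc[df[var] == level, "weight"] *= factor
            max_change = max(max_change, abs(factor - 1.0))
    if max_change < 1e-6:                     # all weighted margins match the targets
        break

# Check: the weighted education distribution should now equal the Census targets.
print(df.groupby("educ")["weight"].sum() / df["weight"].sum())
```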
Although one can weight the demographic distribution of a sample toward that of the target population, weighting of convenience samples does not always yield comparable outcome measures. In the extreme, responses from convenience panels may differ so much from the target population that no adjustment can make them look similar. For example, the PROMIS 2010 re-centering project collected data from members of the OP4G convenience panel (n = 2,996) with demographic characteristics similar to the 2010 Census, but respondents reported worse health by about a half standard deviation on PROMIS domains compared to the PROMIS wave 1 general population sample. The weighted mean on the self-rated health item mentioned above (“In general, how would you rate your health?” 5 = excellent, 4 = very good, 3 = good, 2 = fair, 1 = poor) was 3.24, compared to the weighted mean of 3.42 observed for the PROMIS wave 1 sample (noted above). Similarly, the OP4G sample had an average Health Utilities Index (HUI-3) score of only 0.54, while the median HUI-3 in the U.S. non-institutionalized population 35–89 years old was estimated as 0.88 using random digit dialing (Fryback et al., 2007). The telephone mode of data collection in the Fryback et al. (2007) study yields HUI-3 scores about 0.10 higher than mail (Hays et al., 2009), but this mode effect cannot account for the much lower HUI-3 scores in the OP4G sample.
The PROMIS project also collected data from a sample of 640 adult Spanish-speaking Latinos in the Toluna internet panel and found that only 2% selected Spanish as their language of preference; respondents also reported higher levels of education and lower levels of acculturation than indicated by the 2010 Census data for Latinos (Paz, Spritzer, Morales, & Hays, 2013). Given these characteristics and the sample size, it is unlikely that weighting these data would produce marginal distributions of health matching those of the U.S. Spanish-speaking population.
Challenges in Using Internet Panels
Data integrity is a concern when dealing with data collected from internet panels. Respondents may engage in a variety of less than optimal strategies to get through surveys so they can receive whatever rewards or incentives are offered. This can lead to a variety of undesirable responses, such as false answers, answering too fast, giving the same answer repeatedly (also known as straight-lining or satisficing), and the same respondent completing multiple surveys. To help improve data quality, Liu et al. (2010) excluded respondents who had high levels of missing data (e.g., completed fewer than half of the items), completed items faster than one second per item, or gave the same response to 10 consecutive items. Panel companies often have procedures in place, such as email address and IP address verification, to ensure the identity of individuals who join and to minimize duplicate representation on the panel. Another practice is to provide feedback to respondents who appear to be less than serious in responding to questions, for example, by noting that they are rushing through surveys or often seem to give the same answer.
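Screening rules of this kind can be implemented in a few lines. The following is a minimal Python sketch (not the actual PROMIS cleaning code), assuming a hypothetical wide-format response file with one column per item plus a total completion time; the thresholds mirror those reported by Liu et al. (2010).

```python
# Minimal sketch of data-quality screens like those in Liu et al. (2010):
# flag respondents who complete fewer than half of the items, average under
# one second per item, or give the same answer to 10+ consecutive items.
# File and column names are illustrative assumptions.
import pandas as pd

df = pd.read_csv("panel_responses.csv")      # hypothetical wide-format file
item_cols = [c for c in df.columns if c.startswith("item_")]

# 1. Excessive missing data: completed fewer than half of the items.
too_missing = df[item_cols].notna().sum(axis=1) < len(item_cols) / 2

# 2. Speeding: averaged less than one second per item.
too_fast = (df["total_seconds"] / len(item_cols)) < 1.0

# 3. Straight-lining: same response to 10 or more consecutive items.
def longest_run(row):
    values = row.tolist()
    run = longest = 1
    for prev, cur in zip(values, values[1:]):
        run = run + 1 if pd.notna(cur) and cur == prev else 1
        longest = max(longest, run)
    return longest

straight_lined = df[item_cols].apply(longest_run, axis=1) >= 10

keep = ~(too_missing | too_fast | straight_lined)
print(f"Excluding {int((~keep).sum())} of {len(df)} respondents")
clean = df[keep]
```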
Another issue to confront is that members of convenience panels belong to an average of 2.7 panels (Tourangeau, Conrad, & Couper, 2013). Indeed, Miller (2006) estimated that 30% of internet surveys are completed by 0.25% of the U.S. eligible population. A study that recruited U.S. adults from seven panel vendors using identical quotas found variability in response rates and estimated that different panel vendors appeared to draw 15–25% of their samples from a common pool (Craig et al., 2013).
Conclusions and Future Study
Whether panels (convenience or probability-based) represent the underlying population is not a concern unless the research project needs precise estimates of population values or unbiased estimates of relationships between variables of interest (although associations are typically less affected than point estimates). When the objectives of the study are different, the use of panels to select samples is similar to the large body of research based on undergraduates, patients receiving care at selected sites, or other samples that are not representative of a true underlying population. For such purposes, the use of convenience panels has the advantages of relatively low cost, speed of data collection, and the ability to obtain large numbers of respondents in subgroups of interest. Similarly, methodological and psychometric research that requires a diverse but not necessarily representative sample can benefit greatly from internet panel data sources.
When there is value in representing a defined underlying population, then convenience internet panels may be useful if the data can be weighted to compensate adequately for coverage errors and selection bias. As described above, there are examples where even convenience internet panels can be used as the basis of population norms. But there is no guarantee that any particular convenience internet panel will be suitable for this purpose. Probability-based panels have the major advantage of having a known denominator, but the recruiting rate for these panels is often low. Chang and Krosnick (2009) compared a convenience panel (Harris Interactive) with a probability-based panel (Knowledge Networks) and concluded that “probability samples were more representative of the nation than the nonprobability sample in terms of demographics … even after weighting.” But the average errors of estimates of demographic variables compared to the 2000 Current Population Survey were actually very similar for Knowledge Networks and Harris Interactive (Table 3). This is consistent with the suggestion by Rivers (2013) that there is little practical difference between opting out of a probability sample and opting into a non-probability sample.
Table 3.
| | Harris Interactive | Knowledge Networks |
|---|---|---|
| Education | 6% | 3% |
| Income | 5% | 6% |
| Age | 2% | 2% |
| Race | 1% | 2% |
| Gender | 0% | 1% |
Note: Average errors of demographic estimates relative to the 2000 Current Population Survey; weighted data reported in Table 3 of Chang and Krosnick (2009).
There are no hard and fast rules determining when convenience panels are adequate for population inference, or when response rates to probability-based internet panels are high enough to assume unbiased estimates. For instance, bias in the estimate of a simple mean is a function of the covariance between the propensity to respond and the variable of interest, as well as the response propensity of sample members (Bethlehem, 2002). Meta-analysis suggests that the relation between response rate and bias is not very strong in most cases (Groves & Peytcheva, 2008). Gutsche, Kapteyn, Meijer, and Weerman (2014) used the American Life Panel (with a recruitment participation rate of 10–15%) to forecast the popular vote in the 2012 Presidential election. Their forecast of the final tally was among the very best of some 25 U.S. polling firms, which may suggest that response propensity and political preference were at most weakly correlated.
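Written out (in our notation, following Bethlehem’s general result), the approximate bias of the respondent mean $\bar{y}_r$ is

$$\mathrm{Bias}(\bar{y}_r) \approx \frac{\mathrm{Cov}(\rho, y)}{\bar{\rho}},$$

where $\rho$ is an individual’s propensity to respond, $y$ is the variable of interest, and $\bar{\rho}$ is the average response propensity. The bias is small whenever the propensity to respond is only weakly related to the survey variable, even if the overall response rate (and hence $\bar{\rho}$) is low, which is consistent with the election-forecasting result just described.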
Survey research has entered a new era, with less emphasis on interviewer-based data collection and increasing use of new technologies (Link et al., 2014). More needs to be learned about the strengths and disadvantages of probability-based and convenience internet panels, as well as about web-based data collection in general (Bergeson, Gray, Laibson, Ehrmantraut, & Hays, 2013; Brown, Serrato, Hugh, Kanter, & Hays, submitted). There are also future opportunities to evaluate data collected using mobile devices and social media platforms.
Acknowledgments
Ron D. Hays was supported in part by grants from NCI (1U2-CCA186878-01), the NIA (P30-AG021684) and the NIMHD (P20-MD000182). Honghu Liu was supported in part by a grant from NIDA (R34-DA031643). Arie Kapteyn was supported by the NIA (R01-AG20717).
Footnotes
This paper was presented at the 2014 Society for Computers in Psychology Meeting, Long Beach, CA.
Contributor Information
Ron D. Hays, Department of Medicine, UCLA
Honghu Liu, Division of Public Health and Community Dentistry, School of Dentistry, UCLA.
Arie Kapteyn, Center for Economic and Social Research, USC.
References
- Baker R, Brick JM, Bates NA, Battaglia M, Couper MP, Dever JA, Gile KJ, Tourangeau R. Summary report of the AAPOR Task Force on non-probability sampling. Journal of Survey Statistics and Methodology. 2013;1:90–143.
- Bergeson SC, Gray J, Laibson T, Ehrmantraut LA, Hays RD. CG CAHPS®: Comparing an e-mail invitation and web-based data collection with a mailed invitation and survey. Primary Health Care: Open Access. 2013 (epub). doi: 10.4172/2167-1079.1000132.
- Bethlehem J. Weighting nonresponse adjustments based on auxiliary information. In: Groves RM, Dillman DA, Eltinge JL, Little RJA, editors. Survey Nonresponse. New York: Wiley; 2002. pp. 275–288.
- Brown JA, Serrato C, Hugh M, Kanter M, Hays RD. Effect of a post-paid incentive on response rates to a web-based survey. (submitted)
- Chang L, Krosnick JA. National surveys via RDD telephone interviewing versus the internet: Comparing sample representativeness and response quality. Public Opinion Quarterly. 2009;73:641–678.
- Craig BM, Hays RD, Pickard AS, Cella D, Revicki DA, Reeve BB. Comparison of US panel vendors for online surveys. Journal of Medical Internet Research. 2013;15(11):e260. doi: 10.2196/jmir.2903.
- Fryback DG, Dunham NC, Palta M, Hanmer J, Buechner J, Cherepanov D, Herrington S, Hays RD, Kaplan RM, Ganiats TG, Feeny D, Kind P. U.S. norms for six generic health-related quality-of-life indexes from the National Health Measurement Study. Medical Care. 2007;45:1162–1170. doi: 10.1097/MLR.0b013e31814848f1.
- Gershon RC, Wagster MV, Hendrie HC, Fox N, Cook KF, Nowinski CJ. NIH Toolbox for the assessment of neurological and behavioral function. Neurology. 2013;80(Supplement 3):S2–S6. doi: 10.1212/WNL.0b013e3182872e5f.
- Groves RM, Peytcheva E. The impact of nonresponse rates on nonresponse bias. Public Opinion Quarterly. 2008;72:167–189.
- Gutsche T, Kapteyn A, Meijer E, Weerman A. The RAND Continuous 2012 Presidential Election Poll. Public Opinion Quarterly. 2014;78:233–254.
- Hays RD, Kim S, Spritzer KL, Kaplan RM, Tally S, Feeny D, Liu H, Fryback DG. Effects of mode and order of administration on generic health-related quality of life scores. Value in Health. 2009;12:1035–1039. doi: 10.1111/j.1524-4733.2009.00566.x.
- Hu SS, Pierannunzi C, Balluz L. Integrating a multimode design into a national random-digit-dialed telephone survey. Preventing Chronic Disease. 2011;8(6):A145.
- Link MW, Murphy J, Schober MF, Buskirk TD, Childs JH, Tesfaye CL. Mobile technologies for conducting, augmenting and potentially replacing surveys: Report of the AAPOR task force on emerging technologies in public opinion research. 2014.
- Liu HH, Cella D, Gershon R, Shen J, Morales LS, Riley W, Hays RD. Representativeness of the PROMIS internet panel. Journal of Clinical Epidemiology. 2010;63(11):1169–1178. doi: 10.1016/j.jclinepi.2009.11.021.
- Miller J. Online marketing research. In: Grover R, Vriens M, editors. The Handbook of Marketing Research. Thousand Oaks, California; 2006. pp. 110–131.
- Paz SH, Spritzer KL, Morales LS, Hays RD. Evaluation of the Patient-Reported Outcomes Information System (PROMIS®) Spanish Physical Functioning Items. Quality of Life Research. 2013;22(7):1819–1830. doi: 10.1007/s11136-012-0292-6.
- Riley W, Hays RD, Kaplan RM, Cella D. Sources of comparability between probability sample estimates and nonprobability web samples estimates. Proceedings of the 2013 Federal Committee on Statistical Methodology (FCSM) Research Conference; 2014. http://www.fcsm.gov/13papers/B4_Riley_20
- Rivers D. Comment. Journal of Survey Statistics and Methodology. 2013;1:111–117.
- Saris WE. Ten years of interviewing without interviews: The telepanel. In: Couper M, Baker RP, Bethlehem J, Clark CZF, Martin J, Nicholls WL, O’Reilly J, editors. Computer Assisted Survey Information Collection. New York: Wiley; 1998. pp. 409–429.
- Saris WE, de Pijper WM. Computer assisted interviewing using home computers. European Research. 1986;14:144–150.
- Scherpenzeel A, Das M. True longitudinal and probability-based internet panels: Evidence from the Netherlands. In: Das M, Ester P, Kaczmirek L, editors. Social and Behavioral Research and the Internet: Advances in Applied Methods and Research Strategies. New York: Taylor & Francis; 2011. pp. 77–104.
- Scherpenzeel AC, Toepoel V. Recruiting a probability sample for an online panel: Effects of contact mode, incentives, and information. Public Opinion Quarterly. 2012;76(3):470–490.
- Schonlau M, van Soest A, Kapteyn A, Couper M. Selection bias in web surveys and the use of propensity scores. Sociological Methods and Research. 2009;37:291–318.
- Tourangeau R, Conrad FG, Couper MP. The Science of Web Surveys. Oxford: Oxford University Press; 2013.
- Voigt LF, Schwartz SM, Doody DR, Lee SC, Li CI. Feasibility of including cellular telephone numbers in random digit dialing for epidemiologic case-control studies. American Journal of Epidemiology. 2011;173(1):118–126. doi: 10.1093/aje/kwq322.
- Yeager DS, Krosnick JA, Chang L, Javitz HS, Levendusky MS, Simpser A, Wang R. Comparing the accuracy of RDD telephone surveys and internet surveys conducted with probability and non-probability samples. Public Opinion Quarterly. 2011;75:709–747.