J Am Med Inform Assoc. 2000 Jul-Aug;7(4):416–425. doi: 10.1136/jamia.2000.0070416

Methods for the Design and Administration of Web-based Surveys

Titus K L Schleyer 1, Jane L Forrest 1
PMCID: PMC61445  PMID: 10887169

Abstract

This paper describes the design, development, and administration of a Web-based survey to determine the use of the Internet in clinical practice by 450 dental professionals. The survey blended principles of a controlled mail survey with data collection through a Web-based database application. The survey was implemented as a series of simple HTML pages and tested with a wide variety of operating environments. The response rate was 74.2 percent. Eighty-four percent of the participants completed the Web-based survey, and 16 percent used e-mail or fax. Problems identified during survey administration included incompatibilities/technical problems, usability problems, and a programming error. The cost of the Web-based survey was 38 percent less than that of an equivalent mail survey. A general formula for calculating breakeven points between electronic and hardcopy surveys is presented. Web-based surveys can significantly reduce turnaround time and cost compared with mail surveys and may enhance survey item completion rates.


The Web-based survey described in this article was designed to investigate the use of the Internet in clinical practice by 450 dental professionals. The results of the survey itself have been published previously.1 This paper describes the design and implementation of the survey in detail to assist other researchers who are considering using Web-based surveys. From a review of the background literature and our own experiences, we present issues in sampling for electronic surveys; survey design, programming, testing, and administration; potential problems and pitfalls; and cost comparisons between electronic and hardcopy surveys. We also developed several general cost-based breakeven calculations, applicable when all other variables are equal, to help researchers choose between electronic and traditional mail surveys.

Background

Several recent publications have reported use of the Internet to conduct survey research.2,3,4,5,6,7,8,9 Investigators in the fields of medicine, psychology, sociology, dentistry, and veterinary medicine are recruiting participants for their research studies by targeting specific search engines, newsgroups, and Web sites. Participants often answer surveys by returning a completed form by e-mail or by entering their responses directly on a Web site. Commonly cited advantages include easy access, instant distribution, and reduced costs. In addition, the Internet allows questionnaires and surveys to reach a worldwide population with minimum cost and time. Researchers can contact rare and hidden populations that are often geographically dispersed,3 as well as patient populations different from those typically seen in the clinical or hospital setting.2,10

Other reported benefits relate to graphical and interactive design on the Web. Ideally, HTML survey forms enhance data collection, compared with conventional surveys, because of their use of color, innovative screen designs, question formatting, and other features not available with paper questionnaires. They can prohibit multiple or blank responses by not allowing the participant to continue or to submit the survey without first correcting the response error. This feature is somewhat controversial, because there may be legitimate reasons for not answering questions, and requiring a response, even one such as “don't know” or “prefer not to answer,” forces an answer when participation and question response are supposed to be voluntary.11 Regardless of one's view on this issue, the program can provide cues to make sure the respondent does not inadvertently skip a question. In addition, coding errors and data entry mistakes are reduced or eliminated, and compilation of results can be automated.12 Finally, online forms can help minimize costs, facilitate rapid return of information by participants, and allow timely dissemination of results by investigators.13

Several examples show how the Internet is used for survey research. Physicians in Germany developed a Web-based patient information system about atopic eczema to attract patients to the Web site and Internet survey.2 The purpose of the survey was to explore the relations between atopic stigmata and its symptoms, predisposing factors, patient demographics, and associations with other diseases. As an incentive to fill out the survey, an atopy score was calculated and presented to the participant upon completion. Approximately 240 subjects complete the survey each month. Healthy Web surfers serve as controls.2

In another study, researchers at Columbia University explored the properties of a new measure of sexual orientation by monitoring network traffic on an intranet over a two-week period and collecting all postings to two newsgroups related to their topic of study.3 From the formulated list of e-mail addresses, 360 subjects were randomly selected. Subjects were notified of their selection, and those who consented to participate were e-mailed a survey. Of the participants who were contacted, 66.1 percent provided their consent to participate and 56.4 percent of that group returned completed surveys.3

Veterinarians conducted research via e-mail and Web pages to investigate causes of death in dogs in small veterinary practices.4 In this study, 25 veterinarians submitted case material. On the basis of analysis by region and school attended, the investigators found that participants were representative of the veterinarian population in the United States.

Nursing researchers have found the Internet a valuable vehicle for collecting data from cancer survivors.7 In this study, three cancer-related newsgroups were used to distribute the Cancer Survivors Survey Questionnaire. This method proved useful for collecting preliminary data, which are often needed to demonstrate the feasibility of conducting a large-scale study and for determining adequate sample size.

Theoretically, conducting research over the Internet has many benefits. However, survey experts and researchers warn that the current online population is not representative of the general population in the United States. Estimates of computer ownership and e-mail access vary depending on how the data were gathered, e.g., face-to-face or via telephone, and how they are reported, e.g., household computer ownership vs. “access to” computers.14,15 For example, in 48,000 face-to-face interviews conducted in 1997, 37 percent of households in the United States reported owning a computer, 19 percent reported online access, and 17 percent reported e-mail access. In comparison, through telephone polls, 67 percent reported having access to a computer and 31 percent had an e-mail address.

While access to e-mail and the Internet grows daily, a “digital divide” exists among age and racial groups, income levels, and geographic settings.14 Ensuring that each potential respondent has an equal chance of being selected to participate poses a major challenge in conducting a scientifically sound survey. This is especially true in the health sciences, where electronic access to specific provider or patient groups cannot easily be obtained.9 Currently, not all health professional associations or licensing boards collect e-mail addresses, nor is it possible to estimate the number of individuals with the particular health state of interest who have access to computers and the Internet.7 However, rigorous sample selection procedures must be followed if results are to be generalized to a population and sources of coverage and sampling error are to be kept to a minimum.11,16

Unfortunately, the sampling procedures reported in many electronic surveys reflect unknown samples.2,3,5,13 When subjects are recruited by targeting newsgroups or search engines, it is nearly impossible to determine the distribution of the sample population. These survey procedures should be used only when sampling and self-selection biases can be tolerated.

Another concern unique to conducting electronic surveys is the variation in computer literacy among respondents and in the capabilities of their computers. Internet users tend to be highly educated white men between the ages of 26 and 30 years.13 Even so, their experience responding to online questionnaires may be limited. Thus, Web-based surveys need to have clear directions on how to perform each needed skill, e.g., how to enter answers in a dropdown box or erase responses from a check box,11 so that responding to the questionnaire does not become a frustrating experience.

Providing specific instructions will assist respondents in accurately completing and returning the survey, provided their computer is capable of receiving it in the first place. Differences among computers, such as their processing power, memory, connection speeds, and browsers, potentially negate some of the benefits purported for using the Web. For example, the use of graphics and animation may increase the attractiveness and novelty of participating. However, advanced Web programming features, such as Java, JavaScript, DHTML, or XML, either may be incompatible with certain browsers or may cause them to respond slowly or crash. In The Influence of Plain vs. Fancy Designs on Response Rates for Web Surveys, Dillman et al.17 showed that such features can actually lower response rates. In this study, a plain questionnaire obtained a higher response rate than one that used tables and colors. The plain design also was more likely to be fully completed in a shorter period of time.

Dillman et al. proposed three criteria and 11 supporting principles for designing respondent-friendly Web questionnaires, some of which were used to guide the development of the study presented in this article.11 These criteria include:

  • Take into account that some respondents cannot receive, or cannot easily respond to, Web questionnaires with advanced programming features because of equipment, browser, and/or transmission limitations.

  • Take into account both the logic of how computers operate and the logic of how people expect questionnaires to operate.

  • Take into account the likelihood that a Web questionnaire will be used in mixed-mode survey situations.

The next section describes the purpose of the survey described in this article, how the sample was selected, and how the survey was designed, pilot tested, and administered.

Survey Development and Administration

The Study

The Web-based survey described in this article was designed to investigate the use of the Internet in clinical practice by 450 dental professionals. There were three primary reasons for choosing a Web-based survey method. First, the survey population used e-mail, since all participants subscribed to an Internet discussion list. Use of e-mail is not an absolute indicator of Web use; however, since discussions often referenced Web sites, it seemed likely that the majority of individuals used the Web. The survey results confirmed this assumption. Second, because of an imposed deadline, survey development, implementation, and data analysis had to be completed within eight weeks, which made it impossible to conduct a traditional mail survey. Finally, funds or other resources for the production of a hardcopy survey, postage, and data entry were not available.

Sample Selection

A random sample of dentists could not be selected because no comprehensive list of dentists with e-mail addresses was available. Consequently, the largest discussion list for general dentistry (Internet Dental Forum) was identified. Selection of the discussion list permitted identification of the total population and controlled follow-up with nonrespondents, blending a methodologically sound approach with a new method of collecting data. The investigators believed that selection of this convenience sample, although not representative of all dentists with Internet access, was more appropriate than soliciting volunteers from general sites with unknown populations. Dr. D. Dodell, list owner of the Internet Dental Forum and member of the project team, made the list of e-mail addresses available. Institutional Review Board approval for this survey was not sought, since the project was exempt under 45 CFR §46.101(b)(2).

Survey Design

A 22-question survey instrument with a total of 102 discrete answers was developed. Rather than being presented on a single, lengthy Web page, questions were grouped on 18 sequential screens, for two reasons. First, sequential screens kept transmission time to a minimum and avoided potential server time-outs for respondents with slow modem connections (33 Kbps and below). Second, the use of sequential screens allowed questions to be displayed completely and eliminated the need for participants to scroll through pages and potentially get lost.

Figure 1 illustrates some of the design features of the survey. All screens were designed to display fully at a screen resolution of 800 × 600 pixels. Most screens contained a single question. The top of the screen displayed a static 6-KB JPEG banner with a small picture (which emphasized the clinical aspect of the survey) and the title of the survey. Each question was displayed in bold. List boxes, radio buttons, and check boxes provided answers for closed-ended questions. Text fields were available for answers to open-ended questions. Formatting the answers in a table ensured consistency of layout for different browsers, operating systems, and window sizes. The lower left corner of each screen indicated the participant's relative position in the total number of survey screens (in gray). The lower right corner contained buttons to clear the current screen and to move to the next screen. The total file size of each screen averaged about 9 KB.

Figure 1. Sample screen of the survey.

Several published recommendations and findings for designing survey screens were followed.11,17,18 The small file size minimized download time. Formatting clearly differentiated questions and answers and deemphasized secondary screen elements. Consistent layout reduced the number of required cognitive adjustments and allowed participants to concentrate on answering the questions. The “next” button, combined with the relative screen indicator, encouraged a page-turning rhythm that resembled completing a hardcopy survey.

To minimize incompatibilities with browsers, survey pages were compliant with HTML 3.0. Neither JavaScript, which is often employed to validate entry fields on the Web, nor Java, ActiveX controls, or other advanced Web programming techniques were used, for the reasons cited above.

The survey was programmed in PL/SQL on an Oracle 8 database server. Programming took approximately 35 hours. Code review and testing added another eight hours. Since the code was going to be used only once, the programmer neither optimized the code for performance and maintainability nor added detailed comments. The total length of the program was 2,471 lines. During the code review, the program was reviewed line by line. In the testing phase, each question was answered and the corresponding entry checked in the database. Initially, the survey was programmed to validate every screen (e.g., to check that all fields were filled out and that the zip code was formatted correctly). This feature was deactivated on the basis of the results of the pilot test. A second program also was developed to send survey messages to all participants. This program took three hours to develop and test and totaled 895 lines.
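The original application was written in PL/SQL and is not reproduced here. Purely as an illustration of the screen-per-request structure described above, the following Python sketch generates one survey screen as a plain HTML 3.x form with a bold question, radio-button answers laid out in a table, a screen counter, and Clear/Next buttons. The question text, field names, and the hidden field used to carry the survey ID between screens are assumptions for the example, not details of the original implementation.

```python
# Illustrative stand-in for the original PL/SQL page generator (not the
# authors' code). Emits one survey screen as plain HTML 3.x. The hidden
# "sid" field carrying the survey ID from screen to screen is an assumption
# about how state might be kept between requests.

def render_screen(survey_id, screen_no, total_screens, question, choices):
    """Return the HTML for a single survey screen."""
    rows = "\n".join(
        '<tr><td><input type="radio" name="q{0}" value="{1}"> {2}</td></tr>'
        .format(screen_no, i, choice)
        for i, choice in enumerate(choices, start=1)
    )
    return """<html><head><title>Survey - screen {no} of {total}</title></head>
<body>
<form method="post" action="/survey/screen{next}">
<input type="hidden" name="sid" value="{sid}">
<b>{question}</b>
<table>
{rows}
</table>
<p>Screen {no} of {total}</p>
<input type="reset" value="Clear">
<input type="submit" value="Next">
</form>
</body></html>""".format(no=screen_no, total=total_screens, next=screen_no + 1,
                          sid=survey_id, question=question, rows=rows)

if __name__ == "__main__":
    print(render_screen("A7K2", 3, 18,
                        "How often do you use the Internet in your practice?",
                        ["Daily", "Weekly", "Monthly", "Never"]))
```

Generating each screen on the server, rather than shipping the entire instrument as one long page, mirrors the sequential-screen design described above.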

Pilot Testing

After programming, the survey was pilot-tested in-house and with several remote participants. The program was tested with two different browsers (Netscape Communicator, versions 4.0 and 4.5, and Internet Explorer, version 4.0), three operating systems (Windows NT 4.0, Windows 95, and Macintosh OS 7.5), two types of Internet access (high-speed local area network and modem dial-up line), and three different Internet service providers. The pilot test did not uncover any technical problems. However, the wording of some questions was slightly modified, and the validation for required input fields was dropped. Several pilot-testers felt that requiring entries in all fields was too restrictive, especially when they felt that a question did not apply to them personally or when the response choices did not exactly match their expectations.

Survey Administration

Next, a list of participants' e-mail addresses was generated from the list of subscribers to the discussion list. This list was imported into Microsoft Excel, and a unique, random four-character survey ID was generated for each participant. The IDs were composed of letters and numbers. Participants entered their IDs to authenticate themselves and access the survey. The list was then imported from the Excel file into the Oracle database.
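The IDs in this study were generated in Excel. As a purely illustrative alternative, the short script below generates unique four-character IDs programmatically. Its character set deliberately omits the easily confused characters 0/O and 1/I/l, a precaution suggested by a problem described under Survey Administration below; the original IDs did not exclude them.

```python
import random
import string

# Illustrative only: generate unique four-character survey IDs from letters
# and digits. The alphabet omits 0, O, 1, I, and L, which proved easy to
# confuse in the survey described in this paper.
ALPHABET = "".join(c for c in string.ascii_uppercase + string.digits
                   if c not in "O0I1L")

def make_ids(n, length=4, seed=42):
    rng = random.Random(seed)   # fixed seed so the ID list is reproducible
    ids = set()
    while len(ids) < n:
        ids.add("".join(rng.choice(ALPHABET) for _ in range(length)))
    return sorted(ids)

if __name__ == "__main__":
    survey_ids = make_ids(450)  # one ID per participant
    print(survey_ids[:5])
```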

The main survey page (for introduction and login) was hosted on the discussion list server (idf.stat.com) rather than the Oracle server (heracles.dental.temple.edu). Although this method avoided confusion about the origin of the survey, it prevented the investigators from providing a URL that would have directly logged participants into the Oracle server. To begin data collection and identify any significant problems, a survey message was initially sent to 47 participants. The messages originated on the Oracle server and were sent through an SMTP mailer program that spoofed* an e-mail address on the server hosting the discussion list. Participants received a personal message stating the purpose of the survey, who was conducting it, the estimated time required to complete it, the URL of the survey, the survey ID, and whom to contact with questions. The message also instructed them to return duplicate e-mail messages with the subject line “DUPLICATE.” Once authenticated through their survey ID, participants could begin answering questions.
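The bulk mailer itself was a PL/SQL program on the Oracle server. The sketch below is a minimal, hypothetical Python equivalent of the approach just described: a personalized invitation is sent through a local SMTP relay with a From address on the discussion list's domain rather than on the sending server. The relay host, the exact survey URL, and the message text are placeholders, not details taken from the original study.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical re-creation of the bulk-mailer approach described in the text:
# each participant receives a personal message containing the survey URL and
# his or her survey ID. Setting the From header to an address on the list's
# domain (rather than the Oracle server's) is the "spoofing" described in the
# footnote. Host name, addresses, and URL below are placeholders.

SMTP_HOST = "localhost"                     # local relay on the survey server
FROM_ADDR = "survey@stat.com"               # generic account on the list's domain
SURVEY_URL = "http://idf.stat.com/survey/"  # introduction/login page (illustrative)

def send_invitation(to_addr, survey_id):
    msg = EmailMessage()
    msg["From"] = FROM_ADDR
    msg["To"] = to_addr
    msg["Subject"] = "Survey: Is the Internet useful in clinical practice?"
    msg.set_content(
        "Dear colleague,\n\n"
        "Please complete our Web survey at:\n"
        "  {url}\n"
        "Your personal survey ID is: {sid}\n\n"
        "If you received this message more than once, please reply with the\n"
        "subject line DUPLICATE.\n".format(url=SURVEY_URL, sid=survey_id)
    )
    with smtplib.SMTP(SMTP_HOST) as server:
        server.send_message(msg)

if __name__ == "__main__":
    send_invitation("dentist@example.com", "A7K2")
```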

The group of individuals who responded to the first mailing did not report any problems. However, several problems were identified when the survey was mailed to the remaining 403 individuals:

  • Several individuals stated that they entered their survey ID, clicked the submit button, and then could not proceed with the survey because of an error message. Unfortunately, the source of this problem could not be tracked down. Most of these participants used America Online as their Internet service provider. Consequently, an ASCII copy of the survey was posted on the home page to provide an alternative method of answering the survey. Participants were advised to use this method of replying if the Web form failed. At the same time, participants who had had problems were provided with an explanation. A copy of the survey was included in the reply. Completed surveys received through e-mail or fax were entered by hand into the database later.

  • Some international users with slow modem connections reported that they received a server timeout when trying to answer the survey. As a result, the timeout period for client responses to the server was increased from 60 seconds to five minutes.

  • When typing in their survey ID, several respondents mistook the digit “0” for the letter “O,” and the digit “1” for the letter “l” and vice versa. Thus, the server rejected their ID. Since a complete URL could not be provided for respondents to click on to access the survey, and since most participants obviously did not copy and paste their survey IDs into the field, respondents were advised by e-mail of the correct way of entering their survey ID.

  • Several users were not aware that they could include the text of the original message in their reply or paste the survey from the Web page into an e-mail message. Instead, they printed the survey and returned the completed hardcopy by fax. Although this was not a major problem, it delayed the entry of approximately ten surveys into the database.

  • After receiving approximately 130 surveys, the responses to one question revealed that participants seemed to choose only two of the four responses on a Likert scale. A review of the program revealed an error that stored answers incorrectly. The error was corrected, and the incorrectly stored answers were discarded.

Three additional messages were sent to non-respondents during the following two weeks. Figure 2 shows the dates and times of the messages, the distribution of responses, and the cumulative response rate.

Figure 2. Number of responses received (left scale) and cumulative response rate (right scale, in percent) by date. The graph includes surveys entered through the Web only and indicates when survey mailings were sent.

The response rate for surveys entered via the Web was 32.9 percent—144 of 438 (adjusted) participants—after the initial mailing. The first follow-up mailing resulted in receipt of 76 surveys and brought the total response up to 50.2 percent. The second follow-up mailing raised the response rate to 57.1 percent, and the third raised it to 64.4 percent (30 and 32 responses, respectively). The 52 surveys returned by e-mail or fax were entered by hand and increased the final response rate to 74.2 percent (334 of 438 participants). We sent a total of 1,132 e-mail messages to participants of this survey. To increase the response rate, the survey was directly included in the e-mail message in the second and third follow-up mailings. In addition, while the initial messages and the first follow-up messages were sent from a generic e-mail account (survey@stat.com), the subsequent follow-up messages were sent from the listowner's account directly, with a personal request for a response to the survey.

The next section compares the costs of this survey with costs if the survey had been administered by mail. General breakeven equations for the sample size using Web-based vs. mail surveys are presented. The section concludes with a comparison of characteristics of Web and e-mail/fax respondents.

Cost and Response Pattern Analysis

Costs

Costs were calculated to assess the cost-effectiveness of the Web-based survey method for planning future surveys. Table 1 shows the costs for the Web-based survey compared with the costs of an equivalent mail survey. The comparison excludes costs that are the same regardless of the survey methodology, such as design of the survey instrument and pilot-testing. Also excluded is the cost of obtaining the mailing list.

Table 1. Cost Comparison of a Web-based Survey and Non-anonymous and Anonymous Mail Surveys

Web-based survey
  Survey preparation
    Programming (35 hrs × $30): $1,050
    Testing and code review (8 hrs × $60): $480
  Distribution and return
    Bulk mailer program (3 hrs × $60): $180
    Sending e-mail messages: $0
  Data entry
    Entry by participant: $0
    Manual entry of 52 surveys (52 × 6 min × $0.66): $206
  Total: $1,916

Mail survey (costs shown as non-anonymous, 1,132 mailings / anonymous, 1,800 mailings = 4 × 450 participants)
  Survey preparation
    Printing/duplication of a 7-page survey and the cover letter ($0.08/page): $724 / $1,152
    Envelopes at $0.055/envelope: $62 / $99
    Business reply envelopes at $0.033/envelope: $37 / $59
  Distribution and return
    Mailing of surveys at $0.65/envelope ($0.55 postage + $0.10 envelope stuffing): $736 / $1,170
    Return postage on 334 surveys ($0.63/survey): $210 / $210
  Data entry
    Entry by investigator/staff member (334 × 6 min × $0.66): $1,323 / $1,323
  Total: $3,092 / $4,013

Note: The calculations for the non-anonymous survey use actual response rates. The calculations for the anonymous survey assume that each participant receives the mailing four times, whether they replied or not. Personnel costs include fringe benefits (30%).

As Table 1 shows, the total cost of the Web survey was $1,916, comprising the costs of programming and testing the survey, programming the bulk mailer, and performing limited manual data entry. If all respondents had completed the Web-based survey successfully, the cost would have dropped by $206. Costs for an equivalent mail survey are calculated both for a non-anonymous survey (our case) and an anonymous survey. The two alternatives differ in their cost of preparing and mailing surveys. In non-anonymous surveys, surveys are prepared for and sent to non-respondents only after the initial mailing. Anonymous surveys require that all participants receive the initial and all follow-up mailings and are thus somewhat more expensive. We present different breakeven calculations (see below) for these two options.

If our survey had been administered as a mail survey, its total cost would have been $3,092, including the cost of preparing 1,132 mailings, mailing costs, postage for returned mailings, and data entry. The Web survey was thus 38 percent cheaper than the equivalent (non-anonymous) mail survey. The figures used for the calculations in Table 1 represent local costs for a mail survey of 450 individuals. In other settings, costs might differ on the basis of factors such as sample size, reproduction costs, study requirements, programming costs, and data entry costs. Costs arising from handling technical problems for the Web-based survey (e.g., responding to user questions) were disregarded, since the required time was minimal and an improved design could have avoided most of those problems.

As Table 1 shows, the cost of a Web-based survey is independent of the sample size, whereas the costs of a mail survey vary with the initial sample size as well as with the incremental and the total response rate. Avoiding manual entry of completed surveys generated significant cost savings, a fact that has not been lost on transaction-intensive industries such as airlines and banks.

To assist others in choosing between Web-based and mail surveys on the basis of cost (assuming that all other variables are equal), we use a standard breakeven calculation19 to determine the sample size for which both types of surveys cost exactly the same. This point is the threshold at which conducting one type of survey becomes more cost-effective than the other.

Equations 1 and 2 in Table 2 are used to calculate the breakeven point for non-anonymous and anonymous surveys, respectively, when the incremental and/or the final response rates can be estimated. When that is not possible, equations 3 and 4 can be used to calculate lower and upper bounds for the breakeven point.

Table 2. Equations to Determine Breakeven Sample Size

Incremental and/or final response rates can be estimated:

Non-anonymous survey*:

$$n = \frac{C_{PR}}{(C_D + C_M)\left[1 + \sum_{i=1}^{m-1}(1 - r_i)\right] + r_m\,C_E} \qquad (1)$$

Anonymous survey†:

$$n = \frac{C_{PR}}{m\,(C_D + C_M) + r_m\,C_E} \qquad (2)$$

Incremental and/or final response rates cannot be estimated (both types of survey):

$$n_l = \frac{C_{PR}}{m\,(C_D + C_M) + C_E} \qquad (3)$$

$$n_u = \frac{C_{PR}}{m\,(C_D + C_M)} \qquad (4)$$

Note: Equations to determine the breakeven sample size for non-anonymous surveys (equation 1) and anonymous surveys (equation 2). If incremental or final response rates cannot be estimated, equations 3 and 4 can be used to determine the boundaries for the breakeven point. The equations assume that the denominator is not zero and that the Web-based survey is entered successfully by each respondent. In addition, equations 1 and 2 assume that the cumulative response rate never reaches 100% before the last mailing. The variables are as follows: n, breakeven point; n_l, n_u, lower and upper bounds for the breakeven point; m, number of mailings; r_1, r_2, ..., r_m, cumulative response rate after the 1st, 2nd, ..., mth mailing (r_m is the final response rate); C_PR, cost of programming the survey; C_D, cost of preparation per survey (duplication, envelopes, etc.); C_M, cost of mailing per survey; C_E, cost of receipt and data entry per survey.

* Repeat mailings of the non-anonymous survey are sent to nonrespondents only.

† Repeat mailings of the anonymous survey are sent to all participants.

For the described survey, n would have been 245 under the idealistic assumption that all respondents successfully answered the survey through the Web. Even if the costs of entering the surveys manually are included, the breakeven point rises only to 274. Thus, with a sample size of approximately 275 or below, a mail survey would have been more economical than a Web-based survey.
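As a rough check (not part of the original calculation), equation 1 can be evaluated with approximate per-unit costs back-calculated from Table 1: programming and testing of about $1,710; preparation plus mailing of about $0.73 + $0.65 per survey; receipt and data entry of about $4.59 per returned survey; cumulative Web response rates of roughly 0.33, 0.50, and 0.57 after the first three mailings; and a final response rate of about 0.74:

$$
n \;\approx\; \frac{1710}{(0.73 + 0.65)\bigl[1 + (1 - 0.33) + (1 - 0.50) + (1 - 0.57)\bigr] + 0.74 \times 4.59}
\;=\; \frac{1710}{1.38 \times 2.60 + 3.40} \;\approx\; 245
$$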

Equation 1 makes two assumptions of practical significance. First, it assumes that exactly as many hardcopy surveys are prepared as needed. In reality, this is rarely possible. Equation 1 thus will often reflect slightly lower costs for a mail survey than are achievable, slanting the comparison in favor of mail surveys. Second, it assumes that incremental and final response rates can be estimated. When this is not possible, we can calculate a range for the breakeven point by approximating the extreme values of equation 1 through equations 3 and 4. Using our costs, the lower bound for the breakeven point would have been 190 and the upper bound 347. This means that for a sample size of 189 or less, a mail survey would have been more economical, and with a sample size of 348 or more, a Web-based survey. In between, the cost advantage would have depended on the actual incremental and final response rates. Equations 3 and 4 thus provide a useful heuristic for determining a range for the breakeven point if basic costs for the two survey methods are known.
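For readers who prefer code to algebra, the following sketch implements equations 1 through 4 as written in Table 2. The function names and the example inputs (per-unit costs back-calculated from Table 1 and the response rates reported above) are illustrative and are not part of the original paper.

```python
# Illustrative implementation of the breakeven equations in Table 2.
# Variable names follow the table: c_pr (programming cost), c_d (preparation
# per survey), c_m (mailing per survey), c_e (receipt/data entry per returned
# survey), m (number of mailings), r (cumulative response rates r_1..r_m).

def breakeven_nonanonymous(c_pr, c_d, c_m, c_e, r):
    """Equation 1: follow-up mailings go to nonrespondents only."""
    m = len(r)
    mailings_per_participant = 1 + sum(1 - r_i for r_i in r[:m - 1])
    return c_pr / ((c_d + c_m) * mailings_per_participant + r[-1] * c_e)

def breakeven_anonymous(c_pr, c_d, c_m, c_e, r):
    """Equation 2: every participant receives all m mailings."""
    m = len(r)
    return c_pr / (m * (c_d + c_m) + r[-1] * c_e)

def breakeven_bounds(c_pr, c_d, c_m, c_e, m):
    """Equations 3 and 4: lower and upper bounds when response rates are unknown."""
    lower = c_pr / (m * (c_d + c_m) + c_e)
    upper = c_pr / (m * (c_d + c_m))
    return lower, upper

if __name__ == "__main__":
    # Approximate per-unit costs back-calculated from Table 1 (illustrative).
    n1 = breakeven_nonanonymous(1710, 0.73, 0.65, 4.59, [0.33, 0.50, 0.57, 0.74])
    print(round(n1))   # roughly 245, in line with the figure reported in the text
```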

Comparison of Web and E-mail/Hardcopy Responses

The goal of using a Web-based survey was to have all respondents complete the questionnaire using the Web. However, we had to provide an alternative method to avoid converting individuals with technical or user problems into non-respondents. Once we allowed participants to answer the survey by e-mail or fax, some may have chosen one of these methods based on personal preference or convenience. The data were reviewed to discern potential patterns that might distinguish Web from hardcopy respondents.

Sixteen percent of the 334 respondents returned the completed survey by e-mail or fax. Although no differentiation was made between e-mail and fax responses, the majority were received by e-mail. America Online users submitted 31 percent of the e-mail/fax responses and 12 percent of the Web responses. E-mail messages about technical problems were received most frequently from America Online users. Thus, at least some of the technical problems were due to incompatibilities with America Online. One reason that some AOL users were successful in submitting the survey through the Web may have been their use of different versions of the AOL client software.

Chi-squared tests (significance level, 0.05) were used to test for independence between the type of response (either Web or e-mail/fax) and the following variables: top-level domain of the participant (either com, net, or other); self-reported computer experience (“not at all comfortable,” “not very comfortable,” “comfortable,” “very comfortable”); self-reported years of Internet experience (1, 2, 3, 4, 5, 6, >6); and the number of fields left empty on the survey (<21, 21-25, 26-30). Self-reported computer experience showed a significant relationship with the type of response (χ2 = 10.3; df = 2; P = 0.006), indicating that respondents more comfortable with computers tended to be more successful in completing the Web survey. Respondents answering the Web survey also tended to complete more fields on the survey (χ2 = 37.3; df = 2; P = 0.001).

The other two variables showed no relationship to the type of response. Thus, two hypotheses for future studies could be that successful completion of Web surveys is dependent on computer experience and that respondents to Web surveys complete more questionnaire items than e-mail/hardcopy respondents.
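Association tests of this kind can be reproduced with standard statistical software. The sketch below illustrates a chi-squared test of independence between response type and self-reported computer comfort; the row totals (282 Web and 52 e-mail/fax respondents) match the counts reported in this paper, but the split across comfort levels is invented for illustration, since the underlying contingency table is not published here.

```python
from scipy.stats import chi2_contingency

# Hypothetical contingency table (cell counts invented for illustration only):
# rows    = response type (Web, e-mail/fax)
# columns = self-reported comfort with computers (collapsed into three levels)
table = [[30, 120, 132],   # Web respondents (sums to 282)
         [12,  25,  15]]   # e-mail/fax respondents (sums to 52)

chi2, p, dof, expected = chi2_contingency(table)
print("chi2 = %.1f, df = %d, P = %.3f" % (chi2, dof, p))
```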

Conclusion

Several authors have proposed guidelines on how to conduct Web-based surveys.11,13,17,18,20 However, few papers report the use of this method, and none describe its procedural aspects in detail.2,3,4,5 One primary reason that the Web is not used frequently for large-scale or general surveys may be that Web access is not a given. Although 19 percent of households in the United States reported being online in 1997, not all members of each household may use the computer.21 Even in a recent study, the first step in the survey was to find out whether participants used the Internet or not.17 Although consumer research companies are beginning to tap AOL's 17 million users for market research,22 it has not been proved that AOL's users are representative of the U.S. population at large. Until the e-mail address becomes as widely used as the postal address, large survey populations cannot be surveyed using the Web or e-mail alone.

As previously mentioned, some authors advocate publishing Web surveys through newsgroups, indexes, and search engines.13 However, the resulting selection bias makes the results less valid and generalizable. Where defined populations are accessible through the Internet, a Web-based survey can be an effective method of gathering data using rigorous survey methodologies. However, even a relatively large group of Internet users may still represent a convenience sample that allows generalization to only that group.

Several authors have made recommendations for Web-survey design.11,18 On the basis of our experiences in this case study, some additional recommendations can be made:

  • Consider Web-based surveys to be software development projects. Unless an off-the-shelf survey application23,24 is used, several tools must be integrated (such as HTML forms and PERL scripts). Sometimes survey applications must be custom-programmed. A survey application should be tested thoroughly to reduce the number of software defects and incompatibilities. Depending on the sample population, testing variables may include operating systems, browsers, Internet service providers, and Internet connection types. A systematic approach to identifying all variables and testing can reduce the chance of failure (a sketch of such a test matrix follows this list).

  • Usability and survey design should reflect the characteristics of the sample population and its environment. If the computer literacy of the sample population is not known, the survey should be easy to complete in as few steps as possible. In this case study, simple usability issues became major problems for some participants. Among participants who have a high degree of computer literacy, usability may be less of an issue. Likewise, surveys designed for unknown computing environments should use the lowest common technical denominator. In contrast, a survey application could use advanced programming techniques that match the capabilities of standardized computing environments on an intranet.

  • Pilot-test with a sufficiently large random sample of the sample population. It is often impossible to test all environments for a survey before its release. Thus, it is very important to pilot-test with a sufficiently large random sample of users, which will, it is hoped, account for the range of computer literacy levels and computer configurations.

  • Scrutinize early returns immediately. Scrutinizing early returns is important in mail surveys.25 The potentially immediate response to electronic surveys makes this recommendation even more important. As this example has shown, close monitoring of early returns can identify problems that were not caught during the pilot test, from technical issues to program bugs. Immediate resolution of such problems is required to prevent an unnecessarily large proportion of non-respondents or increased measurement error.
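As a concrete, purely illustrative version of the testing recommendation in the first bullet above, a small script can enumerate the environment combinations to be covered during pilot testing. The browsers, operating systems, and connection types listed are examples patterned after this study's own pilot test, not a prescribed matrix.

```python
from itertools import product

# Illustrative pilot-test matrix: enumerate the environment combinations that
# a Web survey should be exercised against before release.
browsers    = ["Netscape 4.0", "Netscape 4.5", "Internet Explorer 4.0"]
systems     = ["Windows NT 4.0", "Windows 95", "Mac OS 7.5"]
connections = ["LAN", "modem dial-up"]

for i, (browser, system, connection) in enumerate(product(browsers, systems, connections), 1):
    print("%2d. %-22s %-15s %s" % (i, browser, system, connection))
```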

This case study has shown that surveys administered through the Web can, compared with mail surveys, potentially lower costs, reduce survey administration overhead, and collect survey data quickly and efficiently. However, it also confirms that a number of variables have the potential to influence survey response and measurement negatively, such as incompatibility with the target computing environment, survey usability, computer literacy of participants, and program defects. In this case study, respondents to the Web survey tended to complete more questionnaire items than respondents who used e-mail or fax. Because of its potential impact on the quality of data collection, this finding should be validated in future studies.

Acknowledgments

The authors thank Dr. R. Kenney, Dr. D. Dodell, and N. Dovgy for their help in conducting this project, Dr. Sorin Straja for his help with the statistical analysis, Ms. Syrene Miller for her assistance in the literature review, and the reviewers for their helpful suggestions and comments.

This work was supported in part by grant T15-LM07059 from the National Library of Medicine/National Institute of Dental and Craniofacial Research.

Footnotes

* Spoofing is a common technique used to fool hardware and software in networked environments. Spoofing an e-mail address, for instance, makes a message appear to have been sent by someone other than the actual sender.

References

1. Schleyer T, Forrest J, Kenney R, Dodell D, Dovgy N. Is the Internet useful in clinical practice? J Am Dent Assoc. 1999;130:1502–11.
2. Eysenbach G, Diepgen TL. Epidemiological data can be gathered with World Wide Web. BMJ. 1998;316(7124):72.
3. Sell RL. Research and the Internet: an e-mail survey of sexual orientation. Am J Public Health. 1997;87(2):297.
4. Gobar GM, Case JT, Kass PH. Program for surveillance of causes of death of dogs, using the Internet to survey small animal veterinarians. J Am Vet Med Assoc. 1998;213(2):251–6.
5. Schleyer T, Spallek H, Torres-Urquidy MH. A profile of current Internet users in dentistry. J Am Dent Assoc. 1998;129:1748–53.
6. Lakeman R. Using the Internet for data collection in nursing research. Comput Nurs. 1997;15(5):269–75.
7. Fawcett J, Buhle EL Jr. Using the Internet for data collection: an innovative electronic strategy. Comput Nurs. 1995;13(6):273–9.
8. Swoboda WJ, Mühlberger N, Weitkunat R, Schneeweiß S. Internet surveys by direct mailing. Soc Sci Comput Rev. 1997;15(3):242–55.
9. Hilsden RJ, Meddings JB, Verhoef MJ. Complementary and alternative medicine use by patients with inflammatory bowel disease: an Internet survey. Can J Gastroenterol. 1999;13(4):327–32.
10. Soetikno RM, Mrad R, Pao V, Lenert LA. Quality-of-life research on the Internet: feasibility and potential biases in patients with ulcerative colitis. J Am Med Inform Assoc. 1997;4(6):426–35.
11. Dillman D, Tortora RL, Bowker D. Principles for constructing Web surveys. Presented at the Joint Meetings of the American Statistical Association; Dallas, Texas; August 1998.
12. Clark R, Maynard M. Research methodology: using online technology for secondary analysis of survey research data—“act globally, think locally.” Soc Sci Comput Rev. 1998;16(1):58–71.
13. Houston JD, Fiore DC. Online medical surveys: using the Internet as a research tool. MD Comput. 1998;15(2):116–20.
14. National Telecommunications and Information Administration. Falling through the Net: new data on the digital divide. Available at: http://www.ntia.doc.gov/ntiahome/net2/falling.html. Accessed Dec 8, 1999.
15. IntelliQuest. Latest IntelliQuest survey reports 62 million American adults access the Internet/online services. Available at: http://www.intelliquest.com/press/release41.asp. Accessed Dec 8, 1999.
16. Groves R. Survey Errors and Survey Costs. New York: John Wiley, 1989.
17. Dillman D, Tortora RL, Conradt J, Bowker D. Influence of plain vs. fancy design on response rates for Web surveys. In: Proceedings of the Joint Statistical Meetings, Survey Methods Section. Alexandria, Va: American Statistical Association, 1998.
18. Turner JL, Turner DB. Using the Internet to perform survey research. Syllabus. 1999;12(Jan):55–6.
19. Weygandt J, Kieso D, Kell W. Accounting Principles. 2nd ed. New York: John Wiley, 1990.
20. Turner JL, Turner DB. Using the Internet to perform survey research. Syllabus. 1998;12(Nov/Dec):58–61.
21. Stets D. Who's using computers? Philadelphia Inquirer. Nov 19, 1995: sect D3.
22. Kranhold K. Foote Cone turns to AOL for online polls. Wall Street Journal. Jul 28, 1999: sect B1.
23. Senecio Software Inc. Online and disk-by-mail surveys. Available at: http://www.senecio.com/. Accessed Dec 8, 1999.
24. Perseus Development Corporation. A survey software package for conducting Web surveys. Available at: http://www.perseus.com/. Accessed Dec 8, 1999.
25. Dillman DA. Mail and Telephone Surveys. New York: John Wiley, 1978.
