The Milbank Quarterly. 2014 Mar 6;92(1):34–39. doi: 10.1111/1468-0009.12039

Using Social Media and Internet Data for Public Health Surveillance: The Importance of Talking

DAVID M HARTLEY 1
PMCID: PMC3955376  PMID: 24597554

It doesn't have to be like this. Our greatest hopes could become reality in the future with the technology at our disposal. The possibilities are unbounded. All we need to do is make sure we keep talking.

—Stephen Hawking1

Despite progress in public health and the biomedical sciences, infection has yet to be vanquished: vaccine-preventable diseases continue to be transmitted; pandemics occur; previously unknown pathogens emerge; contaminated foods and food products are traded and consumed; and the specter of a post-antibiotic era looms ever larger. Bioterrorism is, and will remain, a danger. Infectious disease is both a national and an international security issue and represents an important threat to human health and well-being.

To confront these and related threats, detailed data regarding the global ebb and flow of disease are needed. Over many decades, surveillance methods (often termed “indicator-based” methods) have been developed and refined to provide disciplined, standardized approaches to acquiring and recording important information. More recently, ubiquitous and unstandardized data collected from the Internet have been used to gain insight into emerging disease events. Although this approach (known as “Internet-based biosurveillance,” “digital disease detection,” or, more simply, “event-based” surveillance) has been described and analyzed in the literature,2-4 systematic reviews of the field have been few.

It is this intellectual gap that makes the article by Edward Velasco and his coworkers in this issue of the Quarterly so valuable and timely. Velasco and his colleagues systematically searched for and reviewed more than 20 years of published studies of event-based systems and approaches, providing a much-needed perspective on both research in the field and several important issues. After selecting relevant peer-reviewed studies to include in their analysis, they extracted the attributes of 13 different event-based systems. They then defined 15 different descriptive attributes that capture the principal facets of event-based systems, including the languages and types of diseases that the systems cover, the methods by which each system produces its output, and the types of users that each system attracts. Such metrics are necessary for comparing and contrasting different approaches to event-based surveillance. Readers should bear in mind that the properties and lifetimes of these systems are dynamic, as is the Internet itself, and that technologies and methodologies change rapidly, allowing systems to improve and evolve over weeks or months. Accordingly, one of the key contributions of Velasco and colleagues’ study is the set of metrics they propose, with which event-based systems can be tracked over time to understand quantitatively how much event-based biosurveillance has changed and continues to change.

Velasco and colleagues seek to provide a basis for public health agencies incorporating event-based methods into existing, comprehensive surveillance programs, and they cite user confidence in this approach as an important step in this process. Their review of the literature, however, uncovered no event-based systems that were regularly incorporated into national programs for surveillance during their study period (1990-2011). Moreover, they found no comprehensive evaluations showing whether or not these systems had been deployed during real-time health events.

Although this evidence may be lacking in the peer-reviewed literature included in their study, there is evidence that several systems are utilized, to varying extents, by national and international public health organizations. At the international level, for example, the World Health Organization (WHO) uses the Canadian-based Global Public Health Intelligence Network in its global alert and response activities.5 The European Centre for Disease Prevention and Control utilizes the MedISys system (http://medusa.jrc.it/medisys/homeedition/en/home.html),4 and a recent study described the evaluation of several event-based systems by international public health professionals.6 At a national level, the US Centers for Disease Control and Prevention (US CDC) utilize event-based data,7 and at the local level, a social media–monitoring program known as Foodborne Chicago is being used to monitor foodborne diseases.8-9 Because the sources of information on the WHO,5 the US CDC,7 and Foodborne Chicago8 are web pages or newspaper stories published after the study period,9 rather than peer-reviewed studies produced during the study period, Velasco and colleagues did not include them. This is less a criticism than an illustration of how quickly event-based data are evolving and of why such information is not necessarily wholly contained in the research literature. Consequently, it will be critical for future studies to include public media and non-peer-reviewed sources in their assessments of event-based data systems.

Of course, a broader question remains: regardless of how many public health workers currently use these systems, what is preventing them from being utilized more broadly and effectively? Here, the approach used in a recent work by Barboza and colleagues is instructive.6 They asked respondents to rate, on a uniform scale, the usability and relative strengths of several event-based systems. The results highlighted the complementarity of different systems and demonstrated the value of using multiple systems to produce the most robust results from the event-based approach. In combination, that study and the work by Velasco and colleagues underscore the importance of consulting stakeholders in the design and refinement of event-based surveillance systems. Accordingly, an assessment of stakeholder engagement would be a useful metric to include in future systematic reviews.

Velasco and colleagues discuss the limitations of event-based systems as well, such as (1) information is not always moderated by professionals or interpreted for relevance before it is disseminated to interested surveillance epidemiologists; (2) there is no standardized system for updates, often resulting in too much information; (3) algorithms and statistical baselines are not well developed; and (4) new information related to health events or probable cases is not always disseminated in the most efficient way. These limitations point to two vital issues.

First, different users have different needs. Some need to see everything reported by event-based surveillance systems (ie, although they are not concerned about specificity, they are concerned about sensitivity), whereas other users may demand low false-alarm rates (ie, specificity is important to their needs). Put another way, some users are more interested in early warnings of threats, so they need to examine all indications of an emerging event. Others, however, are more interested in the situational awareness of identified threats. Thus, interpreting Velasco and colleagues’ findings in the context of diverse users’ needs is paramount.10,11
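
The trade-off between sensitivity and specificity described above can be made concrete with a small calculation. The sketch below is illustrative only: the two alert configurations and all of their counts are invented for this example and do not come from any system reviewed by Velasco and colleagues.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real events that the system flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-events that the system correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for one system run with two alert thresholds,
# evaluated against the same 20 real events and 100 non-events.
broad_alerts = {"tp": 19, "fn": 1, "tn": 60, "fp": 40}   # "show me everything"
strict_alerts = {"tp": 12, "fn": 8, "tn": 95, "fp": 5}   # "few false alarms"

broad_sens = sensitivity(broad_alerts["tp"], broad_alerts["fn"])
broad_spec = specificity(broad_alerts["tn"], broad_alerts["fp"])
strict_sens = sensitivity(strict_alerts["tp"], strict_alerts["fn"])
strict_spec = specificity(strict_alerts["tn"], strict_alerts["fp"])

print(f"broad:  sensitivity={broad_sens:.2f}, specificity={broad_spec:.2f}")
print(f"strict: sensitivity={strict_sens:.2f}, specificity={strict_spec:.2f}")
```

In this invented case, the broad setting would suit a user who wants early warning, while the strict setting would suit a user who wants situational awareness with few false alarms.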

Second, users must be involved in the design and revision of event-based systems in order to address their specific requirements. This point is central to achieving not only a wider use of event-based surveillance but also its more effective use. If event-based surveillance is to be broadly recognized as a timely modality available to government and public health officials, health care workers, and the public and private sectors, this approach must be refined and strengthened in accordance with methodological, engineering, and user support perspectives.3

One of the most promising new event-based surveillance methods is the use of social media in what is known as “participatory epidemiology.” An example is Flu Near You (https://flunearyou.org/), a system in which any individual 13 years of age or older and living in the United States or Canada can register to complete weekly surveys regarding influenza-like illnesses near them. The information on the site is available to public health officials, researchers, disaster-planning organizations, and the general public, with a mobile application available in addition to a Web interface. Such an approach makes it easy for nonspecialists to contribute, in an open and transparent way, data that may provide a valuable addition to indicator-based surveillance (eg, the U.S. Outpatient Influenza-like Illness Surveillance Network [http://www.cdc.gov/flu/weekly/overview.htm]). The use of mobile applications to collect information, as well as to view and access it in the field, represents an important trend in event-based surveillance.

Finally, for both practical applications and user confidence, determining more precisely whether these systems can improve the early detection of, and rapid response to, infectious outbreaks is important.12,13 One promising example of this trend was recently reported by Chunara and coworkers on the use of Internet-based social and news media to enable the estimation of epidemiological patterns early during the 2010 outbreak of cholera in Haiti.14 Their research team was able to estimate the basic reproductive ratio (R0) in that outbreak, a feat difficult even under normal circumstances using carefully collected epidemiologic data in the field.
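
For readers unfamiliar with how R0 can be estimated from early case counts, the following is a minimal sketch of one common textbook approach. It is not presented as the method used by Chunara and coworkers. It assumes early exponential growth and the approximation R0 ≈ 1 + r × Tg, which holds for an exponentially distributed generation interval with mean Tg. The daily counts, the growth rate of 0.1 per day, and the 5-day generation time are all invented for illustration.

```python
import math


def fit_growth_rate(days, counts):
    """Least-squares slope of log(counts) against days (exponential growth rate r)."""
    logs = [math.log(c) for c in counts]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    sxx = sum((x - mean_x) ** 2 for x in days)
    return sxy / sxx


def r0_from_growth(growth_rate, generation_time):
    """Approximation R0 = 1 + r * Tg (assumes exponential generation interval)."""
    return 1.0 + growth_rate * generation_time


# Synthetic, noise-free daily counts growing at 0.1 per day (hypothetical data).
days = list(range(10))
counts = [10.0 * math.exp(0.1 * d) for d in days]

r = fit_growth_rate(days, counts)
r0 = r0_from_growth(r, generation_time=5.0)  # assumed mean generation time in days

print(f"growth rate r = {r:.3f} per day, R0 estimate = {r0:.2f}")
```

Real counts drawn from news or social media would be noisy and incompletely reported, so any estimate of this kind would carry substantial uncertainty.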

For all these reasons, so nicely articulated in the Velasco article, it is safe to state that novel sources of event-driven epidemiological data—along with their accurate use and analysis—will play an even greater role in epidemics and pandemics not yet experienced or even imagined.

References


Articles from The Milbank Quarterly are provided here courtesy of Milbank Memorial Fund
