Online Journal of Public Health Informatics. 2014 Apr 29;6(1):e44. doi: 10.5210/ojphi.v6i1.5106

From Noise to Characterization Tool: Assessing Biases in Influenza Surveillance Methods Using a Bayesian Hierarchical Model

Ying Zhang 1,*, Ali Arab 1, Michael A Stoto 1, Benjamin J Cowling 2
PMCID: PMC4050797

Objective

Our goal is to develop a statistical framework to characterize influenza surveillance systems and their sensitivity to the information environment.

Introduction

Infectious disease surveillance is a process whose product reflects both real illness and public awareness of the disease (Figure 1). According to our previous research [1,2], decisions made by patients, healthcare providers, and public health professionals about seeking and providing healthcare and about reporting cases to health authorities are all influenced by the information environment, which changes constantly. Biases are therefore embedded in each surveillance system and need to be assessed to provide better situational awareness for decision-making.

Methods

We identified influenza surveillance data from Hong Kong covering health care providers, laboratories and residential care homes for the elderly. A Bayesian hierarchical model was developed to estimate the statistical relationships between influenza surveillance data and information environment data (e.g. HealthMap, Google).

For data in percentages:

Data model: [data|process,data parameters]

$\log(Y_{j,t}) \sim N(\mu_{j,t}, \sigma_j^2)$

Process model: [process|process parameters]

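$\mu_{j,t} = \theta_{j,t} X_t + \phi_{j,t}$

$\mu_{j,t} = (\beta_{j,t,1} + \beta_{j,t,2}\, kp_{1,t} + \dots + \beta_{j,t,m}\, kp_{m-1,t})\, X_t + \alpha_{j,t,1} + \alpha_{j,t,2}\, kp_{m,t} + \dots + \alpha_{j,t,n}\, kp_{n-1,t}$

$\varepsilon_t \sim N(0, \sigma_j^2)$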
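The process model for percentage data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`linear_term`, `process_mean`, `log_density_of_log_y`) are hypothetical, and the information environment indices are passed as two separate lists (one feeding the capture component, one feeding the perception bias), which is a simplification of the single indexed set used in the equations.

```python
import math

def linear_term(coefs, indices):
    """Intercept plus weighted indices: coefs[0] + sum(coefs[k] * indices[k-1])."""
    if len(coefs) != len(indices) + 1:
        raise ValueError("need one intercept plus one coefficient per index")
    return coefs[0] + sum(c * k for c, k in zip(coefs[1:], indices))

def process_mean(x_t, beta, kp_theta, alpha, kp_phi):
    """mu_{j,t} = theta_{j,t} * X_t + phi_{j,t} for one system j at one time t."""
    theta = linear_term(beta, kp_theta)   # share of true infection captured
    phi = linear_term(alpha, kp_phi)      # perception bias
    return theta * x_t + phi

def log_density_of_log_y(y, mu, sigma2):
    """Log of the N(mu, sigma2) density evaluated at log(y)."""
    z = math.log(y)
    return -0.5 * math.log(2 * math.pi * sigma2) - (z - mu) ** 2 / (2 * sigma2)

# Hypothetical numbers, for illustration only.
mu = process_mean(x_t=2.0, beta=[0.5, 0.1], kp_theta=[1.0],
                  alpha=[0.2, -0.1], kp_phi=[3.0])
print(mu, log_density_of_log_y(math.exp(mu), mu, 1.0))
```

Keeping the two linear terms separate makes it easy to see which indices shift the captured share of infection and which shift the bias term.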

For count data:

Data model: [data|process,data parameters]

$Y_{j,t} \sim \mathrm{Pois}(\lambda_{j,t})$

Process model: [process|process parameters]

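$\log(\lambda_{j,t}) = \theta_{j,t} X_t + \phi_{j,t}$

$\log(\lambda_{j,t}) = (\beta_{j,t,1} + \beta_{j,t,2}\, kp_{1,t} + \dots + \beta_{j,t,m}\, kp_{m-1,t})\, X_t + \alpha_{j,t,1} + \alpha_{j,t,2}\, kp_{m,t} + \dots + \alpha_{j,t,n}\, kp_{n-1,t}$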
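One consequence of the log link in the count model is that the perception bias term acts multiplicatively on the expected count. The short Python sketch below illustrates this; the function names and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def expected_count(theta, phi, x_t):
    """lambda_{j,t} implied by log(lambda_{j,t}) = theta_{j,t} * X_t + phi_{j,t}."""
    return math.exp(theta * x_t + phi)

def poisson_log_pmf(y, lam):
    """log P(Y = y) for Y ~ Pois(lam), with y a non-negative integer."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)

# Hypothetical values, for illustration only.
theta, x_t = 0.8, 1.5
baseline = expected_count(theta, 0.0, x_t)  # no perception bias
biased = expected_count(theta, 0.3, x_t)    # perception bias phi = 0.3

# The ratio depends only on phi: biased / baseline = exp(phi).
print(biased / baseline, math.exp(0.3))
print(poisson_log_pmf(3, 2.0))
```

Under this reading, a positive bias term inflates the expected count by the factor exp(phi), whatever the underlying incidence.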

For both count and percentage data, the parameter model is:

Parameter model: [data and process parameters]

$\beta_{j,t,m} \sim \mathrm{dnorm}(0, 0.01)$

$\alpha_{j,t,n} \sim \mathrm{dnorm}(0, 0.01)$

$\sigma_j^2 \sim \mathrm{dgamma}(0.01, 0.01)$
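In BUGS notation the second argument of dnorm is a precision (the inverse of the variance), and dgamma takes a shape and a rate. The following Python sketch converts the priors above to more familiar scales, which shows how diffuse they are; the helper names are hypothetical.

```python
import math

def normal_sd_from_precision(tau):
    """Standard deviation of a normal prior given its BUGS-style precision."""
    return 1.0 / math.sqrt(tau)

def gamma_mean_and_variance(shape, rate):
    """Mean and variance of a gamma distribution in shape/rate form."""
    return shape / rate, shape / rate ** 2

# dnorm(0, .01): precision 0.01, i.e. variance 100.
print(normal_sd_from_precision(0.01))

# dgamma(.01, .01): a very spread-out prior centred on 1.
print(gamma_mean_and_variance(0.01, 0.01))
```

A normal prior with standard deviation 10 on a regression coefficient is weakly informative on the log scale used by both data models.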

$X_t$ is the estimated influenza incidence rate in the whole population, and $Y_{j,t}$ is the observation from surveillance system j at time t. $\theta_{j,t}$ describes the component of true infections that surveillance system j captures at time t; it is modeled by a linear regression on information environment indices. $\phi_{j,t}$, modeled by a linear regression on another set of predictors, is defined as perception bias: it estimates the proportion of the surveillance data that cannot be fully explained by true infections. $\beta_{j,t,m}$ and $\alpha_{j,t,n}$ are the coefficients for the set of information environment indices $kp_{l,t}$ ($l = 1, \dots, m-1, m, \dots, n-1$) during the pandemic period. Using the Markov chain Monte Carlo (MCMC) method in OpenBUGS, a posterior distribution was generated for every parameter to characterize each data stream.
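To give a feel for how MCMC produces posterior draws for such a model, here is a minimal random-walk Metropolis sketch in plain Python. It is not the OpenBUGS implementation used in the study: it assumes a single surveillance system, treats $X_t$ as known, holds $\theta$ and $\phi$ constant over time instead of regressing them on information environment indices, and runs on made-up demonstration data (`X_DEMO`, `Y_DEMO`). All names are hypothetical.

```python
import math
import random

def log_posterior(theta, phi, x, y, prior_precision=0.01):
    """Unnormalized log posterior for log(lambda_t) = theta * x_t + phi."""
    log_lik = 0.0
    for x_t, y_t in zip(x, y):
        log_lam = theta * x_t + phi
        log_lik += y_t * log_lam - math.exp(log_lam) - math.lgamma(y_t + 1)
    # Independent zero-mean normal priors with the given precision
    # (constants dropped), mirroring dnorm(0, .01).
    log_prior = -0.5 * prior_precision * (theta ** 2 + phi ** 2)
    return log_lik + log_prior

def metropolis(x, y, n_iter=20000, step=0.05, seed=1):
    """Random-walk Metropolis; returns (draws of (theta, phi), acceptance rate)."""
    rng = random.Random(seed)
    theta, phi = 0.0, 0.0
    current = log_posterior(theta, phi, x, y)
    draws, accepted = [], 0
    for _ in range(n_iter):
        theta_new = theta + rng.gauss(0.0, step)
        phi_new = phi + rng.gauss(0.0, step)
        proposed = log_posterior(theta_new, phi_new, x, y)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            theta, phi, current = theta_new, phi_new, proposed
            accepted += 1
        draws.append((theta, phi))
    return draws, accepted / n_iter

# Made-up demonstration data (not from the study).
X_DEMO = [0.2, 0.5, 1.0, 1.5, 2.0, 1.2, 0.6, 0.3]
Y_DEMO = [3, 5, 7, 12, 20, 9, 5, 4]

draws, rate = metropolis(X_DEMO, Y_DEMO)
kept = draws[5000:]  # discard burn-in
theta_mean = sum(d[0] for d in kept) / len(kept)
phi_mean = sum(d[1] for d in kept) / len(kept)
print(f"acceptance rate {rate:.2f}, theta ~ {theta_mean:.2f}, phi ~ {phi_mean:.2f}")
```

The retained draws approximate the joint posterior of the two parameters, so summaries such as means or credible intervals can be read directly from `kept`.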

Results

The model identified surveillance system characteristics (percentage-based measures, broad case definitions, and senior populations) that are more resistant to the information environment. The general practitioner (%ILIvisit) and laboratory (%positive) systems appear to capture a constant proportion of true infections and are less influenced by the information environment. Surveillance systems with influenza-specific case definitions tend to reflect the biases of both healthcare seekers and providers.

Conclusions

The study identified the characteristics that are likely to be associated with better performance when the information environment is changing rapidly. Moreover, the characterization tool can help practitioners make a more informed decision about which surveillance systems to monitor, given their primary concern with real illness versus public awareness.

Figure 1. Conceptual model for biases in influenza surveillance data

Acknowledgments

This research was conducted with funding support awarded to the Harvard School of Public Health under cooperative agreements with the US CDC (grant number 5P01TP000307-01). The data were provided by the Centre for Health Protection and the Hospital Authority in Hong Kong, and by HealthMap.

References

1. Zhang Y, Lopez-Gatell H, Alpuche-Aranda CM, Stoto MA. 2013. Did Advances in Global Surveillance and Notification Systems Make a Difference in the 2009 H1N1 Pandemic?-A Retrospective Analysis. PLoS ONE. 8(4), e59893. doi: 10.1371/journal.pone.0059893
2. Stoto MA. 2012. The effectiveness of U.S. public health surveillance systems for situational awareness during the 2009 H1N1 pandemic: a retrospective analysis. PLoS ONE. 7(8), e40984. doi: 10.1371/journal.pone.0040984

Articles from Online Journal of Public Health Informatics are provided here courtesy of JMIR Publications Inc.
