Table 1. Surveillance System Attributes for Traditional Sources of Surveillance Information and Electronic Health Records (EHRs).
Surveillance system attribute | Traditional national surveillance surveys^a: strengths | Traditional national surveillance surveys^a: weaknesses | EHRs^b: strengths | EHRs^b: weaknesses |
---|---|---|---|---|
Timeliness | NA | Can take years between data collection and availability | Available soon after collected | NA |
Content and scope | In-depth patient-reported data on behaviors; extensive collection of data on social determinants of health | Limited sample sizes, especially for less common sociodemographic groups | Data on millions of patients make it possible to estimate disease prevalence for rare diseases, less common subgroups (Native Hawaiian/Pacific Islander, American Indian/Alaska Native), small-area geographic units, and population-based cohorts | Limited availability of patient-reported data; availability of social determinants data is increasing but, for many EHRs, is limited to insurance type and linked Census data |
Structured data; data subjectivity; longitudinal data | Health outcomes (vitals, laboratory values) measured objectively according to study protocol | Cross-sectional or panel designs limit longitudinal follow-up | Longitudinal follow-up of patients allows tracking of changes over time; data available on disease control over time | Many data are unstructured (eg, patient notes) and less available for use; standardization of structured data is variable; identification of diseases often depends on nonspecific diagnostic codes; prescription data are typically available, but pharmacy dispensing data may not be |
Representativeness | Nationally representative by design; typically covers entire US population with probability-based sampling strategies | Certain populations can be underrepresented (eg, people without a landline telephone, the institutionalized population); characteristics of respondents may differ from those of nonrespondents in measured or unmeasured ways | Some research networks have data on people in all US states and territories; data on patients with multiple types of insurance (commercial and government) are typically available | Representative of the care-seeking population, which may limit the ability to answer broad surveillance questions at the population level; representativeness of urban versus rural populations depends on which institutions contribute data |
Data quality, completeness | Data collected according to study protocol; robust data completeness and curation | Telephone surveys used in some programs rely on self-report; all surveys are subject to nonresponse | Objective measures of some diseases (eg, diabetes, obesity) and robust computable phenotypes of others | Missing data are common; data are not collected according to a standardized protocol |
Resources required | Infrastructure established by federal agencies to collect data; sampling and weighting strategies are well validated and centrally applied by data collectors; some flexibility to add new questions and data elements | Requires substantial resources and staff | Data are collected during routine clinical activities; additional collection resources are required only for new data elements | Data processing requires substantial resources, especially to address data quality issues that can arise; adding new data elements is challenging |
Abbreviation: NA, not applicable.
^a Examples: National Health and Nutrition Examination Survey (NHANES, www.cdc.gov/nchs/nhanes), Behavioral Risk Factor Surveillance System (BRFSS, www.cdc.gov/brfss).
^b Example: National Patient-Centered Clinical Research Network (PCORnet).