American Journal of Epidemiology. 2022 Oct 4:kwac171. doi: 10.1093/aje/kwac171

Harnessing Google Health Trends API Data for Epidemiologic Research

Krista Neumann 1,2, Susan M Mason 2, Kriszta Farkas 3,4, N Jeanie Santaularia 5, Jennifer Ahern 6, Corinne A Riddell 7,8
PMCID: PMC9619602  PMID: 36193858

Abstract

Interest in using internet search data, such as that from the Google Health Trends Application Programming Interface (GHT-API), to measure epidemiologically relevant exposures or health outcomes is growing due to their accessibility and timeliness. Researchers input search term(s), geography and time period, and the GHT-API returns a scaled probability of that search term, given all searches within the specified geo-time period. In this study, we detail a method for using these data to measure a construct of interest in five iterative steps: first, identify phrases the target population may use to search for the construct of interest; second, refine candidate search phrases with incognito Google searches to improve sensitivity and specificity; third, craft the GHT-API search term(s) by combining the refined phrases; fourth, test search volume and choose geographic and temporal scales; and fifth, retrieve and average multiple samples to stabilize estimates and address missingness. An optional sixth step involves accounting for changes in total search volume by normalizing. We present a case study examining weekly state-level child abuse searches in the United States during the COVID-19 pandemic (January 2018-August 2020) as an application of this method and describe limitations.

Keywords: Google, child abuse, abuse
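
The fifth step in the abstract, retrieving several samples and averaging them to stabilize estimates and address missingness, can be illustrated with a minimal sketch. It does not call the GHT-API. It assumes the samples have already been retrieved as equal-length lists of scaled search probabilities, one value per time period, and that a missing value is represented as None. The function name average_samples and that representation of missingness are illustrative assumptions, not part of the published method.

```python
from statistics import mean
from typing import List, Optional, Sequence


def average_samples(
    samples: Sequence[Sequence[Optional[float]]],
) -> List[Optional[float]]:
    """Average repeated samples period by period, skipping missing values.

    Hypothetical helper: each inner sequence is one retrieved sample of
    scaled search probabilities; None marks a missing value (an assumption).
    """
    if not samples:
        raise ValueError("at least one sample is required")

    n_periods = len(samples[0])
    if any(len(sample) != n_periods for sample in samples):
        raise ValueError("all samples must cover the same time periods")

    averaged: List[Optional[float]] = []
    for t in range(n_periods):
        # Keep only the samples that returned a value for this period.
        observed = [sample[t] for sample in samples if sample[t] is not None]
        # A period missing in every sample stays missing.
        averaged.append(mean(observed) if observed else None)
    return averaged


# Three illustrative samples over three weekly periods (made-up numbers).
example_samples = [
    [1.0, None, 3.0],
    [3.0, None, None],
    [2.0, None, 6.0],
]
example_average = average_samples(example_samples)
```

In this sketch, a period that is missing in some samples is averaged over the samples that did return a value, and a period that is missing in every sample remains missing, so drawing more samples can only reduce the number of missing periods.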

Contributor Information

Krista Neumann, Division of Epidemiology, School of Public Health, University of California, Berkeley, United States.

Susan M Mason, Division of Epidemiology and Community Health, University of Minnesota, Minnesota, United States.

Kriszta Farkas, Division of Epidemiology, School of Public Health, University of California, Berkeley, United States; Division of Epidemiology and Community Health, University of Minnesota, Minnesota, United States.

N Jeanie Santaularia, Division of Epidemiology and Community Health, University of Minnesota, Minnesota, United States.

Jennifer Ahern, Division of Epidemiology, School of Public Health, University of California, Berkeley, United States.

Corinne A Riddell, Division of Epidemiology, School of Public Health, University of California, Berkeley, United States; Division of Biostatistics, School of Public Health, University of California, Berkeley, United States.

Supplementary Material

Web_Material_kwac171

Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press