Abstract
Interest in using internet search data, such as that from the Google Health Trends Application Programming Interface (GHT-API), to measure epidemiologically relevant exposures or health outcomes is growing due to their accessibility and timeliness. Researchers input search term(s), geography and time period, and the GHT-API returns a scaled probability of that search term, given all searches within the specified geo-time period. In this study, we detail a method for using these data to measure a construct of interest in five iterative steps: first, identify phrases the target population may use to search for the construct of interest; second, refine candidate search phrases with incognito Google searches to improve sensitivity and specificity; third, craft the GHT-API search term(s) by combining the refined phrases; fourth, test search volume and choose geographic and temporal scales; and fifth, retrieve and average multiple samples to stabilize estimates and address missingness. An optional sixth step involves accounting for changes in total search volume by normalizing. We present a case study examining weekly state-level child abuse searches in the United States during the COVID-19 pandemic (January 2018-August 2020) as an application of this method and describe limitations.
Keywords: Google, child abuse, abuse
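As an illustration of the fifth step (retrieving and averaging multiple samples), the following minimal Python sketch averages repeated samples week by week. It makes no calls to the GHT-API. The function name `average_samples`, the use of `None` to mark a missing value, and the numbers are all hypothetical. Skipping missing values rather than treating them as zero is an assumption made for illustration and may differ from the handling used in the study.

```python
from statistics import mean
from typing import List, Optional, Sequence


def average_samples(
    samples: Sequence[Sequence[Optional[float]]],
) -> List[Optional[float]]:
    """Average repeated samples period by period.

    Each inner sequence is one sample covering the same time periods.
    Missing values are marked as None (an assumption of this sketch) and
    are skipped; a period missing in every sample stays None.
    """
    if not samples:
        return []
    n_periods = len(samples[0])
    if any(len(sample) != n_periods for sample in samples):
        raise ValueError("all samples must cover the same periods")

    averaged: List[Optional[float]] = []
    for period_values in zip(*samples):
        observed = [v for v in period_values if v is not None]
        averaged.append(mean(observed) if observed else None)
    return averaged


# Hypothetical values: three samples of four weeks for one state.
samples = [
    [12.0, None, 8.0, None],
    [10.0, 9.0, None, None],
    [14.0, 11.0, 10.0, None],
]
weekly_estimates = average_samples(samples)
```

In this made-up example, the second and third weeks are each missing in one sample but still receive an estimate from the other two, while the fourth week, missing in every sample, remains missing.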
Contributor Information
Krista Neumann, Division of Epidemiology, School of Public Health, University of California, Berkeley, United States.
Susan M Mason, Division of Epidemiology and Community Health, University of Minnesota, Minnesota, United States.
Kriszta Farkas, Division of Epidemiology, School of Public Health, University of California, Berkeley, United States; Division of Epidemiology and Community Health, University of Minnesota, Minnesota, United States.
N Jeanie Santaularia, Division of Epidemiology and Community Health, University of Minnesota, Minnesota, United States.
Jennifer Ahern, Division of Epidemiology, School of Public Health, University of California, Berkeley, United States.
Corinne A Riddell, Division of Epidemiology, School of Public Health, University of California, Berkeley, United States; Division of Biostatistics, School of Public Health, University of California, Berkeley, United States.
