Abstract
In response to rapidly changing societal conditions stemming from the COVID-19 pandemic, we summarize data sources with potential to produce timely and spatially granular measures of physical, economic, and social conditions relevant to public health surveillance, and we briefly describe emerging analytic methods to improve small-area estimation.
To inform this article, we reviewed published systematic review articles set in the United States from 2015 to 2020 and conducted unstructured interviews with senior content experts in public health practice, academia, and industry. We identified a modest number of data sources with high potential for generating timely and spatially granular measures of physical, economic, and social determinants of health.
We also summarized modeling and machine-learning techniques useful to support development of time-sensitive surveillance measures that may be critical for responding to future major events such as the COVID-19 pandemic. (Am J Public Health. 2022;112(10):1436–1445. https://doi.org/10.2105/AJPH.2022.306917)
Population health surveillance is a cornerstone of prevention, disease control, and disaster response.1 In the early phase of the COVID-19 pandemic, the lack of reliable, granular COVID-19 data by demographic subgroup was a basic failure of the US surveillance infrastructure. The pandemic’s wide-reaching impacts also underscored the need for timely surveillance of physical, economic, and social conditions, also known broadly as social determinants of health (SDOH), to enable early detection of vulnerable groups and prompt action to mitigate health inequities.
The call for surveillance of SDOH is not new. In 2010, social determinants were formally introduced into the Healthy People 2020 framework.2 In 2016, the Centers for Disease Control and Prevention issued a “Public Health 3.0 Call to Action” for local public health and political leaders to leverage resources to address SDOH and health equity.3 The Call to Action was endorsed by the American Public Health Association and included 5 broad recommendations. One of the 5 recommendations focused on surveillance, stressing that
Timely, reliable, granular-level (ie, subcounty), and actionable data should be made accessible . . . including those targeting the social determinants of health and enhancing equity.4(p4)
In response, the US Department of Health and Human Services established an SDOH workgroup within Healthy People 2030.5 The workgroup selected 7 primary SDOH objectives in 2018, with surveillance measures to track them (Table A, available as a supplement to the online version of this article at https://ajph.org). National surveys were identified as data sources, and their probability sampling frames ensured representativeness of national and, in some instances, state-level estimates. Four of the data sources could generate county-level estimates for one third of US counties. Such data are valuable, yet they fall short of the Public Health 3.0 Call to Action for timely and granular (i.e., subcounty) data.
The goal of this article was to highlight large-volume data sources to monitor local physical, economic, and social conditions in a timely fashion while also meeting other public health surveillance data standards, including representativeness and temporal data quality consistency.6 The study was originally commissioned by the Robert Wood Johnson Foundation and led by researchers at New York University’s (NYU’s) Grossman School of Medicine and Global School of Public Health during January through April 2021 to inform the National Commission to Transform Public Health Data Systems. Because the science of small-area data has unique challenges, we also summarize state-of-science approaches for small-area estimation, including when data are of insufficient volume, as well as methods for analyzing unstructured data and for presenting data to meet needs of local stakeholders.
To inform our work, we conducted a rapid horizon scan to synthesize information on resources that could be harnessed for public health surveillance.7 We performed a scoping review of published literature to identify promising metrics, explored online resources, and conducted interviews with experts to explore the perceived salience of identified metrics and identify additional promising data sources. Findings from the literature review, Internet scan, and interviews were synthesized and presented. We focused our scan on 3 categories:
• Physical environment (climate, ecology, land use, the built environment, air quality, etc.),
• Economic environment (economic stability, employment, financial credit, spending, etc.), and
• Social environment (community wellness, social cohesion or connectedness, overcrowding, daily patterns of mobility, housing, education, social media usage, population distributions, etc.).
The following 2 questions guided our review:
1. What measures of exposure to physical, economic, and social environments could potentially be incorporated into routine public health surveillance that are temporally and spatially granular? We defined temporally granular to mean measures available within 1 year of collection or capture and spatially granular as measures available at spatial levels smaller than county.
2. What are the most promising spatial methods and tools to access, analyze, and parse large or high-velocity data streams?
Information on metrics was extracted using a standardized extraction form (see “Additional Methods,” available as a supplement to the online version of this article at https://ajph.org, for search terms employed and the CONSORT flow diagram).
To complement the scoping review, we identified 7 senior experts from academia, local and federal government, and industry to provide opinions on data sources and metrics as well as to give feedback on preliminary measures identified in the literature review. We used unstructured telephone interviews to explore what types of data on environmental conditions can and should be harnessed for public health surveillance, as well as any challenges experienced using new, high-velocity data sources or related metrics.
We identified many data sources that met 1 of our 2 core criteria—either temporal or spatial granularity. The number of data sources meeting both criteria was smaller. For a complete listing of data sources reviewed, see Table B (available as a supplement to the online version of this article at https://ajph.org).
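The two screening criteria defined above can be written as a simple filter. The following is a minimal sketch, not the extraction form used in the review: the `DataSource` fields, the ordering of geographic levels in `LEVELS`, and the three example entries are all hypothetical and chosen only to show one source that fails each criterion and one that meets both.

```python
from dataclasses import dataclass

# Geographic levels from coarsest to finest (an assumed ordering for illustration).
LEVELS = ["national", "state", "county", "city", "zip_code", "census_tract", "block_group"]


@dataclass
class DataSource:
    name: str
    lag_months: int    # months between data capture and availability
    finest_level: str  # finest geography at which measures are released


def is_temporally_granular(src: DataSource) -> bool:
    """Measures available within 1 year of collection or capture."""
    return src.lag_months <= 12


def is_spatially_granular(src: DataSource) -> bool:
    """Measures available at a spatial level smaller than county."""
    return LEVELS.index(src.finest_level) > LEVELS.index("county")


def meets_both(src: DataSource) -> bool:
    return is_temporally_granular(src) and is_spatially_granular(src)


# Hypothetical entries, not rows from Table B.
sources = [
    DataSource("5-year survey estimates by tract", 24, "census_tract"),
    DataSource("weekly mobility index by county", 1, "county"),
    DataSource("weekly foot traffic by block group", 1, "block_group"),
]

selected = [s.name for s in sources if meets_both(s)]
print(selected)
```

Only the third entry passes, which mirrors the pattern reported above: meeting one criterion is common, and meeting both is not.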
GRANULAR MEASURES
We describe some of the sources of timely, local data for the physical, economic, and social environments, as well as opportunities and actors involved in generating them.
Physical Environment
Perhaps the fastest-evolving area with respect to timely and spatially granular data is the physical environment, in part because of online access to urban planning administrative records and the increasing availability of remote sensing technology (satellite imagery, aerial photography). The global coverage of satellite remote sensors and a stream of efforts to translate raw data into curated, publicly available data sources have wide-ranging potential for public health.8,9 Different satellite bands are being used to capture different measures of the environment. For many years, air pollution data from satellites have been combined with data from monitoring stations to generate small-scale air pollution exposure assessment data.10–13 However, only in the past decade has the National Aeronautics and Space Administration (NASA) made major efforts, in partnership with academic institutions and other government agencies, to provide greater public access to near-real-time data on air quality and other satellite-derived measures relevant to health, most notably through the NASA Health and Air Quality Applied Sciences Team.14
Several NASA Web sites providing physical environment data offer a glimpse of a possible future state of public health surveillance. For example, NASA Giovanni allows users to download or interactively analyze gridded data online using flexible platforms.15 Options allow for time-averaged or time-series data with user-defined dates of interest, as well as user-drawn geographical areas. However, many of these data are not yet optimized for health stakeholders. For example, ozone data are raw column satellite data, whereas health stakeholders require data combined with ground monitor and weather data to capture ozone measures associated with poor health outcomes. Google Earth Engine, another aggregation resource, provides 30 years of historical imagery on surface temperature, climate, land surface, weather, and more. Although designed as a resource for researchers, this Web site provides opportunities for development of public health surveillance-relevant tools and metrics.
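The two summaries described above (a time-averaged map and an area-averaged time series over a user-chosen area and date range) reduce to simple array operations once gridded data are in hand. The sketch below uses a small synthetic grid rather than any NASA product; the function names, the coordinates, and the monthly values are assumptions for illustration, and the area average is unweighted, whereas a production tool would typically weight cells by area.

```python
import numpy as np


def subset_box(grid, lats, lons, lat_range, lon_range):
    """Keep only grid cells whose centers fall inside a bounding box.

    grid has shape (time, lat, lon).
    """
    lat_mask = (lats >= lat_range[0]) & (lats <= lat_range[1])
    lon_mask = (lons >= lon_range[0]) & (lons <= lon_range[1])
    return grid[:, lat_mask][:, :, lon_mask]


def time_averaged_map(sub, start, stop):
    """Average each cell over time steps start..stop-1, ignoring missing values."""
    return np.nanmean(sub[start:stop], axis=0)


def area_averaged_series(sub):
    """Average over all cells at each time step, ignoring missing values."""
    return np.nanmean(sub, axis=(1, 2))


# Synthetic example: 12 monthly layers on a 4 x 5 grid; every cell equals the month index.
lats = np.array([40.0, 40.5, 41.0, 41.5])
lons = np.array([-75.0, -74.5, -74.0, -73.5, -73.0])
months = np.arange(12, dtype=float)
grid = months[:, None, None] + np.zeros((12, lats.size, lons.size))
grid[3, 1, 1] = np.nan  # one missing retrieval

box = subset_box(grid, lats, lons, (40.4, 41.1), (-74.6, -73.9))
summer_map = time_averaged_map(box, 5, 8)  # months with index 5, 6, 7
series = area_averaged_series(box)
print(box.shape, summer_map, series)
```

Using `np.nanmean` lets a single missing retrieval drop out of the average instead of turning the whole summary into a missing value.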
Box 1 shows an extended list of potential physical environment surveillance metrics obtainable from satellite data. These data are not yet used by many local health departments or community stakeholders, in part because of limited local technical capacity and because the data are not integrated with other health data. However, several compelling examples of academic‒health department collaborations demonstrate their potential for timely, spatially granular surveillance metrics.20 New York State health department scientists used the NASA-sponsored North American Land Data Assimilation System to examine granular temperature data and compare them with health outcomes across New York. Findings showed that adverse health effects occurred at less-extreme temperatures than initially thought, prompting officials to reduce the heat advisory threshold in 2018 from 100°F to 95°F. In a NASA‒citizen scientist initiative, trained citizen scientists in Florida helped detect, forecast, and target responses to harmful algal blooms on the Florida coast using video and satellite data, producing 1 or 2 daily forecasts for beaches along the Florida Gulf Coast.
BOX 1.
Measure | Data Source | Temporal Availability | Spatial Availability |
Physical environment | | | |
Air pollution, blue space, water quality, and coastlines16,17 | Moderate-resolution imaging spectroradiometer | Updated on an ongoing basis | 250-m to 1-km spatial resolution |
Ozone | Center for Spatial Science and Systems, George Mason University | Available from 2018 through 2021 | 12-km spatial resolution |
Greenness18 | Advanced Very High Resolution Radiometer (AVHRR) | Available from 1979 through 2019 | 1.1-km multispectral data |
Heat, urban heat islands19 | US Geological Survey EarthExplorer Landsat | Updated on an ongoing basis | 30-m spatial resolution |
Economic environment | | | |
Unemployment | US Bureau of Labor Statistics, local area unemployment statistics | Monthly and annual employment, unemployment, and labor force data | Counties, metropolitan areas, cities with population > 25 000; neighborhood or census tract information not available |
Personal bankruptcy | US district courts Public Access to Court Electronic Records (PACER) database; InfoUSA/Data Axle | Quarterly updates of past-12-mo period | County level; zip code; purchasable geographies (e.g., census tract or other) |
Community credit insecurity index | Federal Reserve Bank of New York (combined American Community Survey and Equifax data) | 2018 currently publicly available; quarterly through 2021 | County level (available); city level; census tract (for cities with population > 50 000) |
Social environment | | | |
Foot traffic social distancing metrics | SafeGraph | Weekly social distancing metrics (Jan 2019‒Sep 2020) | Census block group, tract |
More recently, NYU Grossman School of Medicine partnered with researchers at George Mason University to generate fine-scale (12-kilometer grids) measures of ozone and fine particulate matter (PM2.5), leveraging a new high-resolution air pollution prediction system based on the Weather Research and Forecast model, the Community Multiscale Air Quality model, and ground-level monitors from the Environmental Protection Agency Air Quality System. Beginning in March 2022, these data, updated through the end of 2021, will be featured and routinely updated on the City Health Dashboard, a publicly available data access Web site providing more than 35 measures of health and its drivers for more than 750 US cities with populations greater than 50 000. Figure 1 demonstrates spatial variability in annual maximum ozone values and the importance of temporal granularity attributable to seasonality.
Economic Environment
Economic conditions have a strong influence on the health of individuals and communities. The COVID-19 pandemic’s economic fallouts have highlighted an urgent need for more timely economic measures at the neighborhood level. Many local public health agencies and community organizations routinely access county- and tract-level income and unemployment data from the US Census Bureau and American Community Survey (ACS). For census tracts, the ACS provides 5-year averaged estimates with a 2-year lag (e.g., in 2021, 5-year estimates from 2015 to 2019 were available). The Urban Institute provides a city-level Financial Health of Residents dashboard for 60 cities, but the Web site uses ACS and credit bureau data that are lagged by at least 2 years.
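The ACS timing described above can be made concrete with a short calculation. This is a minimal sketch assuming the 5-year span and 2-year lag stated in the text; the function names are hypothetical. It also reports how old the midpoint of the averaging window is, which is one way to express the effective age of a tract-level estimate.

```python
def latest_acs_5yr_window(current_year, lag_years=2, span_years=5):
    """Return (first_year, last_year) of the most recent 5-year window available."""
    last_year = current_year - lag_years
    first_year = last_year - span_years + 1
    return first_year, last_year


def midpoint_age(current_year, lag_years=2, span_years=5):
    """Years between the midpoint of the averaging window and the current year."""
    first_year, last_year = latest_acs_5yr_window(current_year, lag_years, span_years)
    return current_year - (first_year + last_year) / 2


window = latest_acs_5yr_window(2021)  # (2015, 2019), matching the example in the text
age = midpoint_age(2021)              # 4.0 years
print(window, age)
```

Under these assumptions, a tract estimate consulted in 2021 is centered on 2017, which illustrates why such data are poorly suited to tracking rapid changes in economic conditions.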
In May 2020, in response to the pandemic, Opportunity Insights researchers at Harvard University developed an online dashboard called Economic Tracker to monitor the economic impacts of COVID-19 on communities. The Web site offers near–real-time data (within 2–3 weeks) on consumer spending, small business revenue and openings, and unemployment claims for states, counties, and metro areas.21 Data are presented alongside COVID-19 case, death, and vaccination data. Weekly data summaries are compiled in partnership with several private companies that sell subcounty data. The public-use Web site does not present subcounty data, but academic papers from this team include zip-code measures.22 Other publicly available data sources exist for timely and subcounty economic data, described in Box 1. Figure 2 displays a comparison of recent unemployment figures for select subcounty cities. In this example, Gary, Indiana, is compared with all other cities with a similar population range (75 000‒90 000) and comparably low non-Hispanic White population (10%–20%) in the United States.
US Census Bureau and Internal Revenue Service data sources hold future promise for timely neighborhood metrics, although publicly available data extended only through 2018 at the time of writing this article. These include job totals by census block, information on business establishments and employment at national and various subcounty levels, and adjusted gross income and related tax information at the zip-code level. Several US Census Bureau initiatives were launched mid-2020 to provide more timely information on the US economy in response to the pandemic, including pulse surveys and weekly updated business formation data, but unfortunately these lack local data.24
The pandemic-associated economic crisis also catalyzed academic‒health department collaborations to assess local economic impacts of the COVID-19 pandemic. For example, researchers at 5 universities partnered with the State of Illinois in a grant-funded initiative to use near–real-time data to examine weekly unemployment rates, replacement rates (ratio of unemployment insurance benefits to average weekly wage in 2019), and consumer spending for 18 counties, spanning January to June 2020.25 They published an in-depth analysis by August 2020 showing a massive drop in consumer spending and large spike in unemployment. Data sources were weekly, state-specific unemployment insurance claims and wage records, as well as spending data from the Opportunity Insights Economic Tracker.
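The replacement rate defined above is a simple ratio. The sketch below shows the calculation for two hypothetical counties; the county names and dollar amounts are invented for illustration and are not values from the Illinois analysis.

```python
def replacement_rate(weekly_ui_benefit, avg_weekly_wage_2019):
    """Ratio of unemployment insurance benefits to the average weekly wage in 2019."""
    if avg_weekly_wage_2019 <= 0:
        raise ValueError("average weekly wage must be positive")
    return weekly_ui_benefit / avg_weekly_wage_2019


# Hypothetical inputs: (weekly benefit, 2019 average weekly wage) in dollars.
counties = {"County A": (400.0, 800.0), "County B": (450.0, 600.0)}
rates = {name: replacement_rate(b, w) for name, (b, w) in counties.items()}
print(rates)
```

A rate of 0.5 means benefits replace half of the prior average weekly wage.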
Social Environment
The third environmental domain, “social environment,” is perhaps the most heterogeneous, generally referring to social conditions (e.g., overcrowding, racial residential segregation), as well as measures on education, housing (affordability, insecurity, eviction), broadband access, transportation behavior, neighborhood social cohesion, and community-level measures of social media usage. The ACS has been an important data source for many of these measures, and, as outlined earlier, subcounty (census tract) metrics are 5-year averaged estimates with a 2-year lag.
Beyond the ACS, several initiatives have yielded new data tools and resources on social conditions for audiences nationwide, yet most of these lack either timeliness or spatial granularity. For example, online resources have been developed to track housing-related metrics such as evictions and subsidized housing. These data sources are lagged by more than 2 years. Urban planning advocacy organizations have played a leading role in improving data on transit connectivity and access. For example, the Center for Neighborhood Technology has developed AllTransit, a database tracking numerous public transit metrics, including use, routes within one half mile of households, and jobs accessible within 30 minutes; these data also lag by more than 2 years. Awareness of access to broadband Internet as an SDOH has grown during the COVID-19 pandemic, as many workspaces and schools shifted to remote operation. Broadband providers file data with the Federal Communications Commission twice a year. Data are available at the census tract or block level, allowing for tracking of residential service connections per 1000 households. However, the most recent public data available are from 2018.
Most recent innovations to produce timely, local social environment data fall into 2 arenas: (1) social media usage and content and (2) mobility data using geo-mobile device information. Data collected via Internet usage and social media sites have been used for public health surveillance for more than a decade. Early examples include the creation of Google Flu Trends.26,27 Many publicly available social media data tools provide limited geographic information, because of privacy restrictions or user-restricted geolocation data. Twitter is frequently used for digital public health surveillance, mainly because it allows public access to a 1% random sample of tweets. While Twitter provides its users with the option to “geo-tag” a tweet as it is posted, only a small share of tweets (< 2%) are precisely geocoded. These data are sometimes used with other data sources to monitor mobility patterns during outbreaks.28
Nonprofit organizations such as Digital Epidemiology Lab have developed health trend‒tracking tools, like Crowdbreaks, using tweets with keywords potentially related to specific health topics.29 Google’s symptom search trend database includes aggregated, anonymized search trends for more than 400 symptoms and health conditions, and includes US county-level trends beginning in 2017. Facebook has also launched a “Data for Good” Web site, which provides a social connectedness index at the county level, measuring the frequency and density of social media ties.30
The rapid development of communication technologies, combined with the data from Global Positioning System devices in mobile phones, has propelled the science of tracking human mobility. Even before the COVID-19 pandemic, these data were being used to examine commuting patterns, commercial activity, and community connectedness.31 In response to the pandemic, several Internet technology companies, including Google and SafeGraph, rapidly developed online community mobility databases with measures updated weekly. Other communication technology companies make similar data sets available to researchers for purchase. These metrics have been used by local governments to assess resident mobility and recovery indicators, such as shelter-in-place behavior, foot traffic to points of interest, and more. Privacy policies often limit the availability of public data sets to county levels, yet more granular data are available upon request.
GENERATING AND REPRESENTING SMALL-AREA DATA
While the potential for “Big Data” to provide rapid information about communities is growing, few big data sources are currently free, are easily accessible, and require minimal additional manipulation. Additional analytic tools are needed to model data to smaller spatial boundaries. Translating and representing those data in ways that meet stakeholders’ needs requires flexible estimation and mapping tools. Here we briefly describe some important methods and innovations to characterize the physical, economic, and social environments.
Small-Area Estimate and Modeling Approaches
When large data sets are not sufficiently granular to provide precise data specific to small geographic areas, statistical modeling innovations now enable researchers to generate increasingly precise small-area estimates.32,33 Small-area estimation methods can be broadly categorized as design-based, model-assisted, and model- or algorithm-based. In design-based methods, statistical properties of measures to be estimated are generated directly from the distribution of data.34 Other auxiliary information can also be integrated when using a model-assisted approach. In a recent, compelling example of this method, researchers combined sparsely available survey data with satellite image data to estimate granular spatial distributions of poverty, which they used to enhance traditional census data measures.35 Design-based methods can suffer when samples are small and cannot always address inconsistencies in data (e.g., if the data collection or satellite image features related to poverty differ by place).36,37
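To illustrate why design-based estimates can suffer in small samples, the sketch below computes a survey-weighted (direct) mean and an approximate standard error for one area at a time. It is a simplified illustration rather than a method from the cited studies: the standard error uses a Taylor-linearization approximation that assumes single-stage sampling with replacement and no stratification, and the outcome values and weights are invented.

```python
import numpy as np


def direct_estimate(y, w):
    """Weighted mean of y with an approximate linearization standard error.

    Assumes single-stage, with-replacement sampling and no strata
    (a simplifying assumption for this sketch).
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = y.size
    mean = np.sum(w * y) / np.sum(w)
    if n < 2:
        return mean, float("nan")
    z = w * (y - mean) / np.sum(w)  # linearized contributions, which sum to zero
    se = np.sqrt(n / (n - 1) * np.sum(z**2))
    return mean, se


# Same underlying pattern (40% with the outcome), two very different sample sizes.
pattern = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
y_large = np.tile(pattern, 40)  # 200 respondents in a large area
y_small = pattern               # 5 respondents in a small area

mean_large, se_large = direct_estimate(y_large, np.full(y_large.size, 50.0))
mean_small, se_small = direct_estimate(y_small, np.full(y_small.size, 50.0))
print(mean_large, se_large)
print(mean_small, se_small)
```

Both areas yield the same point estimate, but the standard error for the 5-respondent area is several times larger, which is the imprecision that model-assisted and model-based methods try to reduce.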
Model-based methods assume that sample observations are realizations of random variables that satisfy some underlying model,38 which requires more assumptions. This method has been applied by researchers at the World Bank to generate robust small-area measures of poverty and income inequality for several low-income countries, accomplished by combining both census and survey data via regression models to generate estimates for subpopulations one one-hundredth of the size that the original surveys would allow.39 In general, model-assisted methods based on statistical learning techniques, including kernel methods, splines, neural networks, and others, are being used to capture complex relationships.
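One common way to see how a model-based approach borrows strength is an area-level composite estimator of the Fay-Herriot type, in which each area's direct estimate is shrunk toward a regression prediction in proportion to its sampling variance. The sketch below is an illustration of that general idea, not the World Bank procedure described above. It assumes the between-area model variance is known (in practice it is estimated), and all numbers are invented.

```python
import numpy as np


def composite_estimates(direct, sampling_var, X, model_var):
    """Shrink direct area estimates toward a regression-based synthetic estimate.

    direct:       direct survey estimates, one per area
    sampling_var: sampling variance of each direct estimate (treated as known)
    X:            area-level covariates, including an intercept column
    model_var:    between-area model variance (assumed known in this sketch)
    """
    direct = np.asarray(direct, dtype=float)
    D = np.asarray(sampling_var, dtype=float)
    X = np.asarray(X, dtype=float)

    # Weighted least squares, weighting each area by the inverse of its total variance.
    v = 1.0 / (model_var + D)
    XtV = X.T * v
    beta = np.linalg.solve(XtV @ X, XtV @ direct)
    synthetic = X @ beta

    # Shrinkage factor: near 1 when the direct estimate is precise, near 0 when noisy.
    gamma = model_var / (model_var + D)
    return gamma * direct + (1.0 - gamma) * synthetic, synthetic, gamma


# Hypothetical poverty rates for 3 areas and one census-derived covariate.
direct = np.array([0.10, 0.30, 0.22])
sampling_var = np.array([0.0001, 0.01, 0.001])
X = np.array([[1.0, 0.10], [1.0, 0.20], [1.0, 0.25]])

est, synthetic, gamma = composite_estimates(direct, sampling_var, X, model_var=0.001)
print(gamma)
print(est)
```

The second area, with the largest sampling variance, receives the smallest weight on its own direct estimate and is pulled most strongly toward the regression prediction.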
Recently, algorithm-based methods have also become popular. These approaches build on model-based approaches by designing algorithms that map observed data to corresponding data to be predicted within a spatial area. Researchers then tune the algorithm underlying the model using a training data set so that it “learns” to successfully predict observed data, while other data are withheld for future validation and prediction. For example, Australian researchers generated small-area estimates of household poverty and financial stress by using probabilistic methods to borrow strength from reliable census data and then reweight samples from a national survey.40
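The reweighting idea in the example above can be illustrated with raking (iterative proportional fitting), which adjusts survey weights until weighted totals match known census margins for a small area. This sketch is a generic illustration and not the procedure used in the cited Australian study; the respondent categories and census totals are invented, and it assumes every category contains at least one respondent.

```python
import numpy as np


def rake(weights, categories, targets, n_iter=200, tol=1e-10):
    """Adjust weights so weighted category totals match target margins.

    categories: list of integer arrays, one per margin (category code per respondent)
    targets:    list of arrays of target totals, one per margin
    Assumes every category has at least one respondent.
    """
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(n_iter):
        max_change = 0.0
        for cat, tgt in zip(categories, targets):
            current = np.bincount(cat, weights=w, minlength=len(tgt))
            factor = np.asarray(tgt, dtype=float) / current
            w *= factor[cat]
            max_change = max(max_change, float(np.max(np.abs(factor - 1.0))))
        if max_change < tol:
            break
    return w


# Six hypothetical survey respondents, coded on two margins.
age_group = np.array([0, 0, 1, 1, 0, 1])  # 0 = under 65, 1 = 65 and over
tenure = np.array([0, 1, 0, 1, 0, 1])     # 0 = owner, 1 = renter

# Hypothetical census totals for one small area (both margins sum to 1000).
age_targets = np.array([600.0, 400.0])
tenure_targets = np.array([700.0, 300.0])

w = rake(np.ones(6), [age_group, tenure], [age_targets, tenure_targets])
print(np.bincount(age_group, weights=w), np.bincount(tenure, weights=w))
```

After raking, the reweighted sample reproduces both census margins, so survey outcomes such as financial stress can be tabulated with weights that reflect the small area's composition.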
Mapping Conditions Using Natural Boundaries
Modern analytic methods now also allow for the creation of estimates that are dynamic in their “localization” (i.e., transcending traditionally defined areal unit boundaries). For example, machine learning empowers researchers to model data into flexible functional forms that can then be leveraged to cluster similar areas by location, geographic resolution, privacy, and properties of the disease condition being modeled using a type of artificial neural network called self-organizing maps.41 This method can avoid statistical bias that occurs when aggregating point-based data into administrative units such as zip codes (known as the modifiable areal unit problem).42 Other computing and user design technologies make it possible for areas of interest to be defined in real time by users. Design-based approaches can then be used to weight data and generate new estimates for the selected area.
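The aggregation bias mentioned above can be shown with a toy example in which the same small units are grouped under two different zoning schemes. The case counts, populations, and zone assignments below are invented for illustration.

```python
import numpy as np


def zone_rates(cases, population, zone_ids):
    """Aggregate small units into zones and return the case rate for each zone."""
    n_zones = int(zone_ids.max()) + 1
    zone_cases = np.bincount(zone_ids, weights=cases, minlength=n_zones)
    zone_pop = np.bincount(zone_ids, weights=population, minlength=n_zones)
    return zone_cases / zone_pop


# Eight hypothetical blocks in a row, with a cluster of cases in the middle four.
cases = np.array([1, 1, 9, 9, 9, 9, 1, 1], dtype=float)
population = np.full(8, 100.0)

zoning_a = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # boundary splits the cluster in half
zoning_b = np.array([0, 0, 1, 1, 1, 1, 2, 2])  # middle zone follows the cluster

rates_a = zone_rates(cases, population, zoning_a)
rates_b = zone_rates(cases, population, zoning_b)
print(rates_a)  # both zones at 5%
print(rates_b)  # 1%, 9%, 1%
```

Under the first zoning the cluster disappears into two identical 5% rates, while the second zoning reveals a 9% hotspot. Boundaries that follow the data, or that users define themselves, are meant to reduce this kind of artifact.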
Machine Learning
Granular surveillance metrics can be derived from data sources initially produced for other purposes using machine-learning methods. For example, satellite images can be used to identify physical environment attributes such as green spaces, but only after key attributes are identified from images and assigned a label.43,44 Machine learning methods such as Gaussian processes have also recently been applied to create representations, at a specific temporal frequency or by location, because of their flexibility and ability to deal with missing data, especially in health-related measures.45,46
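To show how a Gaussian process handles gaps in a surveillance series, the sketch below fits a Gaussian process regression with a squared-exponential kernel to a weekly series with 5 missing weeks and predicts every week with an uncertainty estimate. The kernel choice and the fixed hyperparameter values are assumptions made for brevity (in practice they are usually estimated from the data), and the series is synthetic.

```python
import numpy as np


def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential covariance between two sets of time points."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_predict(t_obs, y_obs, t_new, length_scale=1.5, signal_var=1.0, noise_var=0.01):
    """Posterior mean and standard deviation of the latent series at t_new."""
    y_mean = y_obs.mean()
    K = rbf_kernel(t_obs, t_obs, length_scale, signal_var) + noise_var * np.eye(t_obs.size)
    K_star = rbf_kernel(t_new, t_obs, length_scale, signal_var)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - y_mean))
    mean = y_mean + K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = signal_var - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


# Synthetic weekly series with weeks 4 through 8 missing.
weeks = np.arange(13, dtype=float)
observed = np.ones(13, dtype=bool)
observed[4:9] = False
t_obs = weeks[observed]
y_obs = np.sin(t_obs / 2.0)

mean, sd = gp_predict(t_obs, y_obs, weeks)
print(np.round(mean, 2))
print(np.round(sd, 2))
```

The predicted standard deviation is small at observed weeks and grows inside the gap, so the method fills in missing periods while flagging how much less certain those filled-in values are.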
Methodological and Privacy Considerations
Other statistical challenges must be addressed when aggregating geo-located data from large-volume data streams. Data collected for commercial purposes, in particular requiring Internet or mobile app tools, represent self-selected population subgroups, making it difficult to know which groups are and are not well-represented by these data.47 Results can thus be misleading and even damaging if surveillance under- or overdetects important problems. Advancing the science of bias adjustment to enable valid geographic estimation from large, nonrepresentative data sources is an important methodological area of research.48,49 Privacy concerns that limit the further disaggregation of social or economic data can be addressed by applying methods such as “injecting noise” into data sets, a method now widely used by the US Census Bureau and many private companies.50 Access can also be expanded via protected enclaves or “data safe havens” for researchers to work with granular data and then release relevant metrics on public-use data aggregation sites with limited or no risk of reidentifiability.51
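The idea of injecting noise can be illustrated with the Laplace mechanism, a textbook approach from the differential privacy literature. The sketch below is not a description of any agency's or company's production system; the privacy parameter values, the sensitivity of 1 (each person changes a count by at most 1), and the counts are assumptions for illustration.

```python
import numpy as np


def add_laplace_noise(counts, epsilon, rng, sensitivity=1.0):
    """Add Laplace noise with scale sensitivity / epsilon to each count.

    Smaller epsilon means stronger privacy protection and noisier output.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    counts = np.asarray(counts, dtype=float)
    scale = sensitivity / epsilon
    return counts + rng.laplace(loc=0.0, scale=scale, size=counts.shape)


rng = np.random.default_rng(0)
true_counts = np.full(1000, 50.0)  # hypothetical counts for 1000 small areas

noisy_strict = add_laplace_noise(true_counts, epsilon=0.1, rng=rng)
noisy_loose = add_laplace_noise(true_counts, epsilon=10.0, rng=rng)

err_strict = np.mean(np.abs(noisy_strict - true_counts))
err_loose = np.mean(np.abs(noisy_loose - true_counts))
print(err_strict, err_loose)
```

The comparison makes the tradeoff explicit: with a true count of 50, an average error near 10 under the stricter setting could distort small-area rates, whereas the looser setting adds almost no error but offers weaker protection.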
SUMMARY
In this article, we focused on identifying measures available within the past year at the subcounty level as a general rubric that improves upon current standards for measuring SDOH. In general, we found more promising data sources for physical and economic environment measures than for social environment measures. We also identified emerging analytic methods to extend and improve opportunities for small-area estimation, but gaps currently exist between applications of new methods in academic research and private industry and the day-to-day data needs of public health practitioners. Federal and private foundation funding could support relevant applications of these methods to address current data gaps and privacy concerns.
For physical environment measures, global coverage of satellite remote sensors is a cornerstone asset. With multisector activity to curate publicly available data sources and developments in computer science and biostatistics for translating raw images into informative data, these tools and partnerships can have wide-ranging use for public health. The COVID-19 pandemic also stimulated extensive activity by federal agencies and researchers to link data sources for real-time tracking of economic activity, but additional efforts are needed to further disaggregate such data to subcounty levels. Social environment metrics were the most heterogeneous of the 3 categories examined and the realm most impacted by long data lags. Despite a proliferation of data sources, few measures met both criteria of timeliness and spatial granularity except for social media and mobile geo-location data. Partnerships are needed between tech companies (and other data-focused private industries) and public health stakeholders to improve the spatial granularity of existing public-access data sources and generate new relevant measures.
In this work, our goal was to introduce a portfolio of important analytic tools more so than to review them fully. Findings nonetheless underscore that few local health departments or community stakeholders currently have the capacity to work with diverse arrays of raw data sources to generate timely, accurate environmental determinants of health. In this context, public health‒oriented data aggregation Web sites that allow for download of relevant small-area data are valuable tools, especially when linked to health outcomes data. Several unique challenges also exist in generating small-area data for rural settings, including both statistical and privacy concerns for sparse populations, as well as the need for measures that are distinct from those widely used in urban areas.52
Local governments and community leaders across the country require actionable surveillance data that include measures of the physical, economic, and social environment to identify local public health needs, drive change, and deliver results for local populations. Before the COVID-19 pandemic, the United States was already facing a stagnating trend in average life expectancy and tremendous geographic disparities in health and well-being. The COVID-19 pandemic has further exacerbated economic and social hardship while highlighting deep inequities. These intersecting crises underscore the urgent need for timely, neighborhood-level data on health and environmental conditions to guide resource allocation and shape policies and programs for at-risk communities.
ACKNOWLEDGMENTS
External funding for this work was provided by 2 Robert Wood Johnson Foundation (RWJF) grants: (1) Public Health Surveillance 3.0: Harnessing Novel Environmental and Economic Data and Methods to Guide Public Health Practice, RWJF project number 78287, PI: L. E. Thorpe, and (2) City Health Dashboard: A National Resource for Urban Health Improvement, RWJF project number 78440, PI: M. N. Gourevitch. Additional support for L. E. Thorpe was provided by a Centers for Disease Control and Prevention grant (U48DP006396).
Finally, we would like to also thank Neil Kleiman for his feedback on drafts of this article.
CONFLICTS OF INTEREST
The authors have no conflicts of interest to declare.
HUMAN PARTICIPANT PROTECTION
This work was not human participant research.
REFERENCES
- 1.Thacker SB, Qualters JR, Lee LM. Public health surveillance in the United States: evolution and challenges. MMWR Suppl. 2012;61(3):3–9. [PubMed] [Google Scholar]
- 2.US Department of Health and Human Services. 2010. https://www.healthypeople.gov/2010/hp2020/advisory/SocietalDeterminantsHealth.htm [DOI] [PubMed]
- 3.DeSalvo KB, O’Carroll PW, Koo D, Auerbach JM, Monroe JA. Public health 3.0: time for an upgrade. Am J Public Health. 2016;106(4):621–622. doi: 10.2105/AJPH.2016.303063. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.DeSalvo KB, Wang YC, Harris A, Auerbach J, Koo D, O’Carroll P. Public health 3.0: a call to action for public health to meet the challenges of the 21st century. Prev Chronic Dis. 2017;14:E78. doi: 10.5888/pcd14.170017. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Office of Disease Prevention and Health Promotion, Office of the Secretary, US Department of Health and Human Services. https://health.gov/healthypeople/about/workgroups/social-determinants-health-workgroup
- 6.German RR, Lee LM, Horan JM, Milstein RL, Pertowski CA, Waller MN. Updated guidelines for evaluating public health surveillance systems: recommendations from the Guidelines Working Group. MMWR Recomm Rep. 2001;50(RR-13):1–35. [PubMed] [Google Scholar]
- 7.National Academies of Sciences, Engineering, and Medicine. Safeguarding the Bioeconomy. Washington, DC: National Academies Press; 2020. [PubMed] [Google Scholar]
- 8. De Sherbinin A, Levy MA, Zell E, Weber S, Jaiteh M. Using satellite data to develop environmental indicators. Environ Res Lett. 2014;9(8):084013. doi: 10.1088/1748-9326/9/8/084013.
- 9. Dietrich D, Dekova R, Davy S, Fahrni G, Geissbühler A. Applications of space technologies to global health: scoping review. J Med Internet Res. 2018;20(6):e230. doi: 10.2196/jmir.9458.
- 10. Kleeman MJ, Kaufman JD, Ostro B, et al. Enhanced air pollution epidemiology using a source-oriented chemical transport model. Washington, DC: US Environmental Protection Agency; 2014.
- 11. Kloog I, Koutrakis P, Coull BA, Lee HJ, Schwartz J. Assessing temporally and spatially resolved PM2.5 exposures for epidemiological studies using satellite aerosol optical depth measurements. Atmos Environ. 2011;45(35):6267–6275. doi: 10.1016/j.atmosenv.2011.08.066.
- 12. Lee HJ, Liu Y, Coull BA, Schwartz J, Koutrakis P. A novel calibration approach of MODIS AOD data to predict PM2.5 concentrations. Atmos Chem Phys. 2011;11(15):7991–8002. doi: 10.5194/acp-11-7991-2011.
- 13. Lindström J, Szpiro AA, Sampson PD, et al. A flexible spatio-temporal model for air pollution with spatial and spatio-temporal covariates. Environ Ecol Stat. 2014;21(3):411–433. doi: 10.1007/s10651-013-0261-4.
- 14. Holloway T, Miller D, Anenberg S, et al. Satellite monitoring for air quality and health. Annu Rev Biomed Data Sci. 2021;4(1):417–447. doi: 10.1146/annurev-biodatasci-110920-093120.
- 15. The National Aeronautics and Space Administration. Giovanni. 2021. https://giovanni.gsfc.nasa.gov/giovanni
- 16. McCarthy MJ, Colna KE, El-Mezayen MM, et al. Satellite remote sensing for coastal management: a review of successful applications. Environ Manage. 2017;60(2):323–339. doi: 10.1007/s00267-017-0880-x.
- 17. Georgiou M, Morison G, Smith N, Tieges Z, Chastin S. Mechanisms of impact of blue spaces on human health: a systematic literature review and meta-analysis. Int J Environ Res Public Health. 2021;18(5):2486. doi: 10.3390/ijerph18052486.
- 18. Fong KC, Hart JE, James P. A review of epidemiologic studies on greenness and health: updated literature through 2017. Curr Environ Health Rep. 2018;5(1):77–87. doi: 10.1007/s40572-018-0179-y.
- 19. White-Newsome JL, Brines SJ, Brown DG, et al. Validating satellite-derived land surface temperature with in situ measurements: a public health perspective. Environ Health Perspect. 2013;121(8):925–931. doi: 10.1289/ehp.1206176.
- 20. The National Aeronautics and Space Administration. 2018. https://science.nasa.gov/earth-science/applied-sciences/making-space-for-earth/one-health-day
- 21. Chetty R, Friedman JN, Hendren N, Stepner M. The economic impacts of COVID-19: evidence from a new public database built using private sector data. Cambridge, MA: National Bureau of Economic Research; 2020.
- 22. Tulier ME, Reid C, Mujahid MS, Allen AM. “Clear action requires clear thinking”: a systematic review of gentrification and health research in the United States. Health Place. 2019;59:102173. doi: 10.1016/j.healthplace.2019.102173.
- 23. Department of Population Health, NYU Langone Health. 2022. https://www.cityhealthdashboard.com
- 24. US Census Bureau. New census surveys provide near real-time info on households, businesses during COVID-19. 2020. https://www.census.gov/library/stories/2020/04/new-census-surveys-provide-near-real-time-info-on-households-businesses-during-covid-19.html
- 25. Casado MG, Glennon B, Lane J, McQuown D, Rich D, Weinberg BA. The Effect of Fiscal Stimulus: Evidence From COVID-19. Cambridge, MA: National Bureau of Economic Research; 2020.
- 26. Althouse BM, Ng YY, Cummings DA. Prediction of dengue incidence using search query surveillance. PLoS Negl Trop Dis. 2011;5(8):e1258. doi: 10.1371/journal.pntd.0001258.
- 27. Aiello AE, Renson A, Zivich PN. Social media- and Internet-based disease surveillance for public health. Annu Rev Public Health. 2020;41(1):101–118. doi: 10.1146/annurev-publhealth-040119-094402.
- 28. Rocklöv J, Tozan Y, Ramadona A, et al. Using Big Data to monitor the introduction and spread of Chikungunya, Europe, 2017. Emerg Infect Dis. 2019;25(6):1041–1049. doi: 10.3201/eid2506.180138.
- 29. Müller MM, Salathé M. Crowdbreaks: tracking health trends using public social media data and crowdsourcing. Front Public Health. 2019;7(81). doi: 10.3389/fpubh.2019.00081.
- 30. Facebook Data for Good. Social Connectedness Index. 2022. https://dataforgood.fb.com/tools/social-connectedness-index
- 31. Kang Y, Gao S, Liang Y, Li M, Rao J, Kruse J. Multiscale dynamic human mobility flow dataset in the US during the COVID-19 epidemic. Sci Data. 2020;7(1):390. doi: 10.1038/s41597-020-00734-5.
- 32. Ghosh M, Rao J. Small area estimation: an appraisal. Stat Sci. 1994;9(1):55–76. doi: 10.1214/ss/1177010647.
- 33. Jia H, Muennig P, Borawski E. Comparison of small-area analysis techniques for estimating county-level outcomes. Am J Prev Med. 2004;26(5):453–460. doi: 10.1016/j.amepre.2004.02.004.
- 34. Lavrakas PJ. Encyclopedia of Survey Research Methods. Thousand Oaks, CA: Sage Publications; 2008.
- 35. Jean N, Burke M, Xie M, Davis WM, Lobell DB, Ermon S. Combining satellite imagery and machine learning to predict poverty. Science. 2016;353(6301):790–794. doi: 10.1126/science.aaf7894.
- 36. Buil‐Gil D, Solymosi R, Moretti A. Nonparametric bootstrap and small area estimation to mitigate bias in crowdsourced data: simulation study and application to perceived safety. In: Hill CA, Biemer PP, Buskirk RD, editors. Big Data Meets Survey Science: A Collection of Innovative Methods. Hoboken, NJ: John Wiley and Sons; 2020. pp. 487–517.
- 37. Tatem AJ, Adamo S, Bharti N, et al. Mapping populations at risk: improving spatial demographic data for infectious disease modeling and metric derivation. Popul Health Metr. 2012;10(1):8. doi: 10.1186/1478-7954-10-8.
- 38. Pfeffermann D. New important developments in small area estimation. Stat Sci. 2013;28(1):40–68. doi: 10.1214/12-STS395.
- 39. Elbers C, Lanjouw JO, Lanjouw P. Micro-level estimation of poverty and inequality. Econometrica. 2003;71(1):355–364. doi: 10.1111/1468-0262.00399.
- 40. Tanton R, Vidyattama Y, Nepal B, McNamara J. Small area estimation using a reweighting algorithm. J R Stat Soc Ser A Stat Soc. 2011;174(4):931–951. doi: 10.1111/j.1467-985X.2011.00690.x.
- 41. Relia K, Akbari M, Duncan D, Chunara R. Socio-spatial self-organizing maps: using social media to assess relevant geographies for exposure to social processes. Proc ACM Hum Comput Interact. 2018;2:145. doi: 10.1145/3274414.
- 42. Openshaw S. The modifiable areal unit problem. In: Wrigley N, Bennett RJ, editors. Quantitative Geography: A British View. London, UK, and Boston, MA: 1981. pp. 60–69.
- 43. Guo Y, Liu Y, Georgiou T, Lew MS. A review of semantic segmentation using deep neural networks. Int J Multimed Inf Retr. 2018;7(2):87–93. doi: 10.1007/s13735-017-0141-z.
- 44. Abdur Rehman N, Saif U, Chunara R. Deep landscape features for improving vector-borne disease prediction. ArXiv. 2019;1904.01994v1
- 45. Flaxman S, Wilson A, Neill D, Nickisch H, Smola A. 2015.
- 46. Akbar M, Chunara R. Using contextual information to improve blood glucose prediction. ArXiv. 2019;1909.01735
- 47. Chunara R, Wisk LE, Weitzman ER. Denominator issues for personally generated data in population health monitoring. Am J Prev Med. 2017;52(4):549–553. doi: 10.1016/j.amepre.2016.10.038.
- 48. Wang W, Rothschild D, Goel S, Gelman A. Forecasting elections with non-representative polls. Int J Forecast. 2015;31(3):980–991. doi: 10.1016/j.ijforecast.2014.06.001.
- 49. Shirani-Mehr H, Rothschild D, Goel S, Gelman A. Disentangling bias and variance in election polls. J Am Stat Assn. 2018;113(522):7. doi: 10.1080/01621459.2018.144823.
- 50. Abowd JM. Census Blogs: Research Matters. Washington, DC: US Census Bureau; 2018. Protecting the confidentiality of America’s statistics: adopting modern disclosure avoidance methods at the Census Bureau.
- 51. Burton PR, Murtagh MJ, Boyd A, et al. Data safe havens in health research and healthcare. Bioinformatics. 2015;31(20):3241–3248. doi: 10.1093/bioinformatics/btv279.
- 52. Scally CP, Burnstein E, Gerken M. 2020. https://www.census.gov/newsroom/blogs/research-matters/2018/08/protecting_the_confi.html