. 2025 Sep 26;12(1):34. doi: 10.1007/s40572-025-00497-4

Harnessing Geospatial Artificial Intelligence (GeoAI) for Environmental Epidemiology: A Narrative Review

Hari S Iyer 1, Seigi Karasaki 2,3, Li Yi 4, Yulin Hswen 5, Peter James 4,6,7, Trang VoPham 2,3
PMCID: PMC12474636  PMID: 41003951

Abstract

Purpose of Review

Geospatial analysis is an essential tool for research on the role of environmental exposures and health, and critical for understanding impacts of environmental risk factors on diseases with long latency (e.g. cardiovascular disease, dementia, cancers) as well as upstream behaviors including sleep, physical activity, and cognition. There is emerging interest in leveraging machine learning and artificial intelligence (AI) for environmental epidemiology research. In this review, we provide an accessible overview of recent advances.

Recent Findings

There have been two major recent shifts in geospatial data types and analytic methods. First, novel methods for statistical prediction, combining geospatial analysis with machine learning and artificial intelligence (GeoAI), allow for scalable geospatial exposure assessment within large population health databases (e.g. cohorts, administrative claims). Second, the widespread adoption of smartphones and wearables with global positioning systems and other sensors has allowed for passive data collection from people, and when combined with geographic information systems, enables exposure assessment at finer spatial scales and temporal resolution than ever before. Illustrative examples include refining models for predicting outdoor air pollution exposure, characterizing populations susceptible to water pollution, and use of deep learning to classify Street View image-derived measures of greenspace. While these tools and approaches may facilitate more rapid, higher quality objective exposure measures, they pose challenges with respect to participant privacy, representativeness of collected data, and curation of high quality validation sets for training of GeoAI algorithms.

Summary

GeoAI approaches are beginning to be used for environmental exposure assessment and behavioral outcome ascertainment with higher spatial and temporal precision than before. Epidemiologists should continue to apply critical assessment of measurement accuracy and design validity when incorporating these new tools into their work.

Supplementary Information

The online version contains supplementary material available at 10.1007/s40572-025-00497-4.

Keywords: Artificial intelligence, Geographic information systems, Environmental health, Epidemiologic methods, Big data, Data science

Introduction

Geospatial data and geographic information systems (GIS) are essential tools for environmental health research, including the storage, processing, visualization, and statistical analysis of spatial data [1]. These core functions enable researchers to map areas with high burden of exposure and disease, incorporate spatial structure into models for prediction and causal inference, and inform efforts to identify environmental risk factors for disease [2–4]. Use of spatial data for environmental exposure assessment [5–7], disease mapping [6, 8–10], and planning and resource allocation [11, 12] has been a large and growing area of health research.

Two emerging trends have potential to markedly improve our understanding of how places and environmental contexts may influence human health and health behaviors. The first is the growing practice of combining geospatial analyses using large administrative population and environmental monitoring databases and research cohorts with artificial intelligence-based statistical predictions, or “Geospatial Artificial Intelligence” (GeoAI), which can process and analyze large amounts of geospatial data at scale, reducing researcher time and effort [13]. While some define GeoAI as methods that explicitly incorporate spatial dependence in machine learning algorithms [14], we broadened our review to cover applications of machine learning and artificial intelligence for exposure assessment and environmental epidemiology using geospatial data as predictors. More general overviews of GeoAI applications for health are available elsewhere [14–18]; here our focus is on GeoAI applications specifically for studying relationships between place-based environmental exposures and human health. Though GeoAI offers many opportunities to environmental epidemiologists (Box 1), it comes with its own set of challenges, such as understanding the assumptions made by these models and obtaining representative data for training and validation. The second trend is increasing volumes of “Big Data” for public health, which encompasses passively collected data from wearable devices, smartphone-based GPS and behavioral tracking, and electronic health records [19]. These resources enable researchers to overcome earlier limitations of sample size and reduce costs of data collection. However, these new databases also pose their own challenges, including costs and availability of gold standard exposure data that can be used to train AI algorithms, outcomes, and relevant study features; and ethical implications arising from use of data that could identify an individual without their consent [20].

Box 1 Examples of GeoAI with relevance to environmental epidemiology.

Classifying and predicting environmental features using satellite-derived remote sensing: Satellite-derived remote sensing data have been used to train GeoAI methods for classifying land cover types or environmental features over differing spatial resolutions [21], identifying objects within an image [22, 23], cleaning remote sensing images (e.g. removing noise or errors introduced by atmospheric factors that interfere with sensors) [24], and identifying specific times or periods when the greatest change in environmental features is observed [25].

Urban planning-related contextual environment: GeoAI approaches are being developed to integrate multiple streams of data generated by cities to improve lives of residents. For example, GeoAI is used by rideshare companies and public transportation authorities to more efficiently address demand by adjusting driver-passenger matching and traffic signals, respectively [26, 27], which could also reduce traffic-related pollution. GeoAI can predict areas at greatest risk of heat-related health issues by modeling the urban heat island effect [28]. Lastly, GeoAI is being used for sustainable energy consumption by adjusting power use to meet demand while reducing waste [29].

Earth systems analysis: Building on earlier process-based Earth System Models, GeoAI has improved accuracy of short- and long-term climate predictions through integration of satellite-derived remote sensing data [30]. GeoAI has also been used to improve monitoring of greenhouse gas emissions through facility-level high-resolution mapping of carbon emissions [31]. In order to prevent harms from natural disasters, GeoAI is being incorporated into early warning systems for pollution and other extreme weather events [32, 33].

Here, we describe the growing use of GeoAI for environmental epidemiologic exposure assessment and research. We then discuss how GeoAI applied to passively collected health and behavioral data is capturing relevant exposures across an individual’s activity space, rather than just from their residence as was done in earlier eras. We discuss how GeoAI approaches could contribute to recent advances in environmental epidemiologic methods and mixtures, along with ethical considerations required by investigators.

Acquiring and Preparing Geospatial Data for GeoAI

GeoAI approaches require high-quality input data. Because specific geospatial data sources and data types have already been reviewed extensively in the context of public health research [13, 34–38], here we summarize considerations when acquiring and preparing input data sources for GeoAI applications in environmental epidemiology and exposure assessment focusing on the US, although most types of datasets described are available globally (Table 1).

Table 1. Major geospatial data used for GeoAI research with US examples

| Data source | What is collected | Who collects | Types of environmental or contextual factors | Time period and frequency | Spatial scale | Example application of GeoAI with data source | Quality considerations | Example data sources |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Administrative | Disease registries, environmental monitoring, population demographics and economic factors | Government (local, national) | Socioeconomic factors (education, income, occupation), racial composition, poverty, air or water pollutants | 1860-present; usually updated between 3–10 years | US: census block group, census tract, ZIP code tabulation area, county | Validate socioeconomic status measures estimated from Street View images of car make and model | Error in self-reported information, uncertainty, missing or erroneous percentages, harmonization of boundaries over time | National Census, American Community Survey, Demographic and Health Surveys |
| Satellite-derived remote sensing | Satellite images, electromagnetic spectrum, aerosol optical depth | Government (national) | Land cover, greenspace, outdoor light at night, air pollution, aircraft imagery, building attributes, temperature | US: 1985-present; usually updated annually | Varies (30 m – 1 km are most common) | Feature detection and forecasting using self-attention models, enhance low-resolution satellite images, learning from unlabeled data | Accounting for surface reflectance, removing cloud cover, mosaics to stitch complete data | Landsat, MODIS, Copernicus |
| Street View image | Panoramic images taken along road networks | Businesses and volunteers | Greenspace, neighborhood disorder | US: Varies (since 2007 for Google Street View) | Point locations | Classifying types of green space as perceived by people walking | May only be collected at a single time point | Google Street View, Baidu, Mapillary (crowdsourced) |

Administrative, satellite-derived remote sensing, and street view data are major sources of geospatial data available to environmental epidemiologists to characterize exposures. These sources are attractive for use in GeoAI because they offer extensive geographic coverage, raw data often are available at fine (< 1 km) spatial scales, and data are available over a long period of time (over twenty years in many cases) with annual or more frequent updates. Given the granularity of spatial and temporal data offered, these databases capture enormous variation in place-based characteristics that can be exploited for GeoAI methods to characterize social, built environment, and pollution exposures [16, 39–42].

Prior to applying GeoAI methods using these data sources, researchers should reflect on the completeness and accuracy of information contained therein. In addition, predictors may need to be normalized or transformed before applying GeoAI algorithms [43–46]. If predicting exposure by applying GeoAI algorithms in a novel dataset is the goal of the study, high quality labeled feature data are required. Certain databases may require further pre-processing. For example, satellite-derived remote sensing data often contain unusable or missing information due to cloud cover or other atmospheric anomalies that must be corrected using specialized algorithms [47, 48].
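
As a minimal sketch of the normalization step mentioned above, the following Python snippet rescales each predictor to zero mean and unit variance (z-scores). The predictor names and values are hypothetical, and z-scoring is only one of several possible transformations; the cited studies may have used others.

```python
from statistics import mean, pstdev


def standardize(columns):
    """Z-score each predictor column: (x - mean) / standard deviation.

    `columns` maps a predictor name to a list of numeric values.
    Constant columns (standard deviation of 0) are mapped to all zeros.
    """
    scaled = {}
    for name, values in columns.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation
        if sigma == 0:
            scaled[name] = [0.0 for _ in values]
        else:
            scaled[name] = [(v - mu) / sigma for v in values]
    return scaled


# Hypothetical predictors measured on very different scales
predictors = {
    "aod": [0.10, 0.20, 0.30, 0.40],
    "elevation_m": [5.0, 120.0, 300.0, 1500.0],
}
scaled = standardize(predictors)
```

In a predictive modeling workflow, the mean and standard deviation would normally be estimated on the training data only and then reused for validation and test data, so that no information leaks across the split.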

Researchers should also assess the appropriateness of a given geospatial database for their GeoAI study based on how data were sampled. Government, business, and volunteer-based sampling offer strengths and weaknesses in terms of coverage and precision. While government databases offer complete coverage of a population of interest, errors related to reporting and completeness may be more likely. Businesses may offer higher quality data, but may be smaller in scale and more costly to access. Volunteer-based data sources may offer more granularity than either businesses or government, particularly for select populations or geographies, but lack generalizability. Thus, epidemiologists should exercise judgment regarding the internal and external validity of input data sources and quality when determining an appropriate geospatial data source to use in a GeoAI algorithm [49].

Narrative Review Methods

Our goal in this narrative review was to provide the reader with an overview of key concepts and emerging trends in the use of GeoAI in environmental epidemiologic and exposure assessment research, rather than providing a comprehensive review. We began by identifying papers in the past three years that applied deep learning and mobile phone-based big data approaches for assessing relationships between residential environment, pollutants, and lifestyles in clinical, cancer, and environmental health journals. We began with a focus on cancer because the use of geospatial analysis approaches has been a major research and funding priority by the US federal government in recent years [50].

In order to supplement our initial search, we obtained a list of peer-reviewed journal articles covering topics relevant to GeoAI and environmental health from PubMed using the “Advanced Search” tool with the search terms (geospatial artificial intelligence health) AND (environment), restricting to English language studies only. We then screened paper titles and abstracts to confirm that papers included in the review focused on environmental exposure assessment, human health, and application of statistical methods that incorporated spatial analysis and machine learning or artificial intelligence-based methods to predict exposure, health outcomes, or relationships between the two. We excluded reviews, abstracts, and studies that were out of scope or did not focus on human health. Of 85 results, 43 were ultimately retained. This keyword search was complemented by snowball sampling of articles cited in references. These retained papers are summarized in Supplementary Table 1. Included papers focused predominantly on exposure assessment for air pollution and water quality (29/43, 66%), and compared different AI and machine learning approaches against one another using R², Area Under the Curve (AUC), and Root Mean Squared Error (RMSE). The most common methods were eXtreme Gradient Boosting (XGBoost) and Random Forest (RF), which generally produced the best predictions of exposure.
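
To make the three comparison metrics named above concrete, the sketch below computes each one in plain Python. It is illustrative only: the data are made up, R² is taken here to be the coefficient of determination (some papers instead report a squared correlation), and AUC is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up continuous exposure values (e.g., pollutant concentrations)
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
exposure_rmse = rmse(observed, predicted)
exposure_r2 = r_squared(observed, predicted)

# Made-up binary outcome (e.g., contaminant detected or not) with model scores
detected = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
detection_auc = auc(detected, scores)
```

RMSE and R² apply to continuous predictions such as pollutant concentrations, whereas AUC applies to classification tasks such as predicting whether a contaminant is detected.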

GeoAI Combined with Big Data for Environmental Exposure Assessment

For most environmental exposures, individual-level exposure measurement would be prohibitively expensive in large populations over long time periods. GeoAI is well-suited for exposure prediction, as it integrates spatial modeling and theory with large datasets and predictive algorithms to estimate where and for whom environmental exposures might pose the greatest risk or greatest benefit. When trained with high quality, representative data, deep learning algorithms can accelerate the laborious task of exposure assessment [51–53]. Geospatially derived exposures of air pollution, water pollution, and features of the built environment can be developed using geospatial AI approaches, and then used to map areas of high or low risk based on pollutant exposure and health [7] (Fig. 1).

Fig. 1. Visual illustration of how GeoAI and Big Data sources are beginning to be used to advance environmental exposure assessment

Enhancing Predictions of Ambient Air Pollution

Air pollution has been a major focus of GeoAI exposure modeling for environmental exposure assessments in epidemiologic studies [43, 48, 54–68]. Utilization of GeoAI methods has demonstrated improved predictive performance compared to earlier air pollution exposure models that relied on ground monitors only [69]. Exposure models have been developed for many air pollutants such as nitrogen dioxide [70, 71], ozone [57, 59], and ammonia [46] (Supplementary Table 1). Here, we focus on GeoAI methods for fine particulate matter (PM2.5), the most widely studied air pollutant [72].

Using GeoAI offers potential benefits over classical methods for air pollution prediction. First, these methods are able to account for complex characteristics of atmospheric mechanisms, including nonlinearity, interactions, and spatial and temporal autocorrelation [73], which can lead to better performance. Second, these models can be trained using publicly available satellite aerosol optical depth measures and low-cost local sensors and scaled across large populations and geographic areas, enabling studies of air pollution and health globally where government monitoring may be limited [43, 56]. While earlier statistical methods focused on outdoor air pollution only, studies are beginning to leverage available data to obtain estimates of exposure when traveling or when indoors [63]. The aforementioned GeoAI PM2.5 exposure models are associated with high predictive performance (cross-validation R² ranging from 0.73–0.89) [73–76]. Further, these models have been used in epidemiologic studies that have contributed critical knowledge regarding the health effects of PM2.5 [77, 78]. Yet limitations of these methods include the requirement of large numbers of variables for prediction that may not be available for all study areas or time periods and high computing resource needs.

Di et al. (2016) developed a neural network-based hybrid model integrating satellite-derived remote sensing data on aerosol optical depth (AOD), chemical transport model outputs, land use, and meteorological variables [73]. Convolutional layers in the neural network were used for PM2.5 modeling to aggregate variable values from nearby grid cells or monitoring sites, which enabled the learning algorithm to determine the relative importance of variables while accounting for neighboring influences and potential interactions. Hu et al. (2017) created a random forest model incorporating geospatial data on predictors of PM2.5 concentrations not included in other exposure models such as percent impervious surface [74]. Compared to a neural network approach that involves a two-stage structure for training, a random forest approach implements a simpler one-stage structure. Di et al. (2019) established an ensemble PM2.5 exposure model integrating three machine learning algorithms (neural network, random forest, and gradient boosting) that incorporated over 100 variables using a generalized additive model and smoothing functions to account for nonlinear and/or geographically varying relationships [75]. Integrating PM2.5 predictions from multiple machine learning models allows for final predictions to take advantage of models that perform better in different settings based on location, concentration or time period, leading to overall improved model performance. Beyond the U.S., Shen et al. (2024) have optimized a deep learning residual convolutional neural network for global estimation of PM2.5 levels, which is able to identify and learn from images (e.g., spatial relationship of PM2.5 with predictors in its surrounding environment) [76].
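
The idea of combining predictions from several learners can be illustrated with a deliberately simplified sketch. Note that this is not the method of Di et al. (2019), who combined models using a generalized additive model with smoothing functions; here, as a stated substitution, each model is simply weighted by the inverse of its mean squared error on held-out data. All model names and numbers are hypothetical.

```python
def inverse_mse_weights(observed, model_preds):
    """Weight each model by 1 / MSE on held-out data, normalized to sum to 1.

    `model_preds` maps a model name to its list of held-out predictions.
    """
    inverse_errors = {}
    for name, preds in model_preds.items():
        mse = sum((o - p) ** 2 for o, p in zip(observed, preds)) / len(observed)
        inverse_errors[name] = 1.0 / max(mse, 1e-12)  # guard against division by zero
    total = sum(inverse_errors.values())
    return {name: w / total for name, w in inverse_errors.items()}


def blend(weights, new_preds):
    """Weighted average of the models' predictions for a single location."""
    return sum(weights[name] * new_preds[name] for name in weights)


# Hypothetical held-out monitor observations and model predictions
observed = [8.0, 10.0, 12.0]
validation_preds = {
    "neural_network": [9.0, 11.0, 13.0],
    "random_forest": [8.0, 10.0, 14.0],
    "gradient_boosting": [8.5, 10.5, 12.5],
}
weights = inverse_mse_weights(observed, validation_preds)

# Blend hypothetical predictions at a new, unmonitored location
ensemble_value = blend(
    weights,
    {"neural_network": 11.0, "random_forest": 9.0, "gradient_boosting": 10.0},
)
```

A single global weight per model cannot capture the location-, concentration-, or time-specific strengths described above; estimating weights separately by region or season, or using smooth functions as in the published model, would be needed for that.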

Relatedly, accurate, reliable predictions of wildfire smoke (or extreme weather and climate events at large [79]) can support resource allocation, preparedness efforts, and early warning systems for vulnerable populations. Researchers have used methods ranging from gradient boosted trees [80] to ensemble-based deep learning [81] to generate high-resolution predictions of fine particulate matter in locations and time periods where direct measurements are unavailable. These approaches have also been used to model the spread of wildfire itself; Shadrin et al. (2024) used deep neural networks to predict wildfire spread [82].

Augmenting Prediction of Drinking Water-Based Contaminants and Communities at Risk

GeoAI has also been used to estimate water quality, contaminants, and pesticides, enabling population-level estimates of exposed populations [45, 83–90]. Historically, the development of water pollutant models was labor intensive, requiring detailed maps of water pipes, well calibrated fate and transport physical models, and historical surveys of industrial contamination [91, 92]. Leveraging GeoAI methods can support linkages of water monitoring data to households that are most likely to receive water from a given distribution system, without requiring costly surveys and acquisition of private information about water systems. For example, the US EPA used a GeoAI approach to estimate community water system service areas across the US, leveraging public data on population characteristics and water system infrastructure to predict whether a given census block was more likely to be served by a water system or to procure its drinking water through other private means (e.g., domestic wells). Water system IDs were used to group together neighboring blocks predicted to be served by public water systems. For census block groups with multiple water system matches, they used random forest models to predict the most likely water system serving the majority of the census block group. Model validation using separate training and test datasets found very high concordance (AUC = 0.9997) [93].

GeoAI has also been used to identify areas where contamination is most likely to occur. Emerging contaminants, such as per- and polyfluoroalkyl substances (PFAS), pose significant challenges for drinking water management and provision and health [94–96]. With tens of thousands of classes of PFAS, sampling and monitoring have proven to be both technically and financially difficult at scale. A recent study by Tokranov et al. (2024) used eXtreme Gradient Boosting (XGBoost), a tree-based machine learning approach, to predict the probability of PFAS detection in groundwater across the contiguous U.S. with an AUC of 0.83 [97]. The researchers trained their model using historical water quality data, the spatial distribution of known PFAS sources, and environmental and hydrogeological predictors, and validated their predictions using both k-fold cross validation and independent test datasets. A limitation of these studies is that sampling data are clustered in coastal areas and in Appalachia, with limited sampling in the central U.S.

Even long-term regulated contaminants like lead present significant monitoring challenges [98]. This is because lead contamination can be tested for and treated at the point of distribution, but contamination can occur between distribution and the point of exposure [99]. Recent research has focused on predictive models to identify areas at high risk for lead contamination past the point of distribution. Huynh et al. (2024) applied GeoAI approaches for estimating lead exposure from drinking water in schools [100], using microsimulation modeling approaches to predict potential high-risk levels in children through linkages with sociodemographic information from the US Census. Hajiseyedjavadi et al. (2020) investigated homes with high lead concentrations in tap water [101], applying spatial cross-validation in their machine learning algorithms to predict lead levels, which reduced potential bias in predictions arising from non-uniform spatial sampling of points.
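
One common way to implement spatial cross-validation is to assign whole geographic blocks, rather than individual samples, to folds, so that nearby and likely correlated samples never appear in both the training and test sets. The sketch below illustrates this idea under stated assumptions; it is not necessarily the procedure used by Hajiseyedjavadi et al., and the coordinates, block size, and number of folds are hypothetical.

```python
import math


def block_of(lat, lon, block_size_deg):
    """Return the (row, column) index of the grid block containing a point."""
    return (math.floor(lat / block_size_deg), math.floor(lon / block_size_deg))


def spatial_block_folds(points, block_size_deg, n_folds):
    """Assign each (lat, lon) point to a fold based on its grid block.

    All points in the same block share a fold, so spatially clustered
    samples are held out together rather than split across folds.
    """
    blocks = sorted({block_of(lat, lon, block_size_deg) for lat, lon in points})
    block_to_fold = {block: i % n_folds for i, block in enumerate(blocks)}
    return [block_to_fold[block_of(lat, lon, block_size_deg)] for lat, lon in points]


# Hypothetical tap water sampling locations: two clusters and one isolated site
samples = [
    (42.36, -71.06),
    (42.37, -71.05),
    (40.71, -74.00),
    (40.72, -73.99),
    (38.90, -77.03),
]
folds = spatial_block_folds(samples, block_size_deg=1.0, n_folds=3)
```

With ordinary random k-fold splitting, two samples from the same neighborhood could land on opposite sides of the split, which tends to make performance estimates look better than they would be in unsampled areas.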

Improving Built Environment, Greenspace and Pollutant Exposures Using Satellite-Derived Remote Sensing

Satellite imagery captures a variety of information, including visible features of land (built environment, settlements, greenspace) and atmospheric elements (chemical composition of atmosphere, spectral bands of light). The breadth of built and contextual environmental factors captured can, when combined with GeoAI methods, allow for improved disease prediction and inference regarding drivers of morbidity in different geographic settings [62, 64, 102–108]. GeoAI enables social, built environment, land cover, meteorological, and environmental monitoring data to be used to assess correlations with different disease types.

An important application of these Big Data satellite instruments is to study potential impacts of exposures to nature (e.g., spending time in forests and parks) on health [109–111]. In the Nurses’ Health Study (NHS) and Nurses’ Health Study II (NHSII) prospective cohorts, Landsat-derived greenspace was appended to participants’ addresses in prospective studies of systemic inflammation [112], cognitive function [113], and depression [114], suggesting potential health benefits of residing in neighborhood environments with higher exposure to vegetation. Satellite-derived measures of greenspace from Landsat satellites were linked to residential ZIP codes among US Medicare Claims beneficiaries, revealing possible inverse associations with cardiovascular disease hospitalizations and possible increases in respiratory disease hospitalizations in urban areas [115]. Residential ZIP code-level greenness was associated with lower rates of Alzheimer’s Disease and related dementia hospitalizations [116].

Sentinel satellite data on air pollutants offer scientists in resource-limited settings with relatively weaker governmental regulation of environmental air pollutants a way to monitor changes in pollution associated with health outcomes. For example, studies in India [117] and Iran [118] demonstrate how academics are beginning to leverage Sentinel data to identify areas and communities that may be at heightened risk of high air pollution exposure, showing that with sufficient technical capacity in use of GeoAI techniques, epidemiologists can produce estimates of health impacts of air pollution more efficiently.

Detecting Street-Level Features of the Natural and Built Environment

While use of satellite-derived exposures has advanced the study of environmental exposures and health, these measures may not accurately reflect an individual’s interaction with their environment [119]. This limitation may be overcome through use of ubiquitous street-level images, which provide panoramic views of a point location that better capture an individual’s perception of their environment [107, 120–123]. Deep learning techniques, such as convolutional neural networks, can be applied to street-level images to identify environmental features such as greenspace, building density, and physical disorder that may impact health [124]. Outputs of these deep learning techniques can be used to detect built environment features, such as trees, grass, sidewalks, and urban disorder from eye-level views [124], which may be more strongly correlated with individuals’ own perceptions compared to residential satellite-derived measures [125]. Emerging analyses that apply street-view exposure metrics could further advance the field by elucidating, for example, the specific type of greenspace features that may drive health behaviors, mental health, and disease outcomes.

Street-level environmental exposure assessment through deep learning applied to street-level images has been used to study health effects of visible greenspace. For example, Yi et al. examined associations of different street-view greenspace components with adiposity measures (landscaping items such as flowers and plants) and cardiovascular health (trees), respectively, in children from Project Viva, an eastern Massachusetts-based cohort [125, 126]. Yi et al. further examined over 350 million street-level images and found street-view trees were associated with lower incident depression over 17 years of follow-up [127]. Lastly, street-view trees were also found to be associated with a lower risk of Parkinson’s Disease among 45.6 million Medicare beneficiaries across the US [128]. Nguyen et al. (2019, 2021) [129, 130] applied computer vision techniques to over 16 million street-level images and found that areas with limited ground-level greenspace infrastructure exhibited higher rates of obesity and diabetes. These studies demonstrate how combining GeoAI with Big Data from Street View images can advance earlier work using satellite-derived greenspace measures, revealing specific forms of greenspace (e.g. street trees) that may offer the most benefits for health.

Novel Big Data Sources for Acquiring Population Health Behavior Data

Passive Data Collection to Describe, Map, and Predict Environmental Burden

Low-cost personal PM2.5 monitors [131], such as the Airbeam by Habitat Map (a community-based organization in Brooklyn, New York), are portable, able to track individual-level exposures over time, and enable easy data-sharing with researchers or community members [43]. Over 3 billion measurements have been taken using Airbeam. Users may contribute their monitoring data to public maps, which can supplement government monitoring databases, and be leveraged for communities to advocate for regulations [132, 133]. The Smoke Sense citizen science program of the EPA is another example of community-based monitoring and education regarding wildfires and ambient air pollution [134]. Users download an app and can test knowledge regarding protective behaviors, share data on pollution in their environment, and receive smoke and air quality alerts. These programs suggest that crowd-sourced and community-focused efforts to expand exposure monitoring can allow for population-based sampling, which may overcome limited generalizability of exposure distributions and effect estimates from occupation-based environmental epidemiology Big Data cohorts.

Consumer Wearable Devices and Smartphones for Passive Health Behavior Data Collection

Researchers are merging spatial datasets with data from accelerometers and consumer wearable devices, which provide high temporal resolution (e.g., 50 measures per second), objective data on movement and other data streams, including light sensors and heart rate sensors [135, 136]. These devices can be used to derive validated measures of physical activity, sleep, and heart rate variability. Much of this work originated with research-grade accelerometry, including Actigraph devices [137], which researchers provided to participants for a sampling burst (e.g., one week) to derive health behavior metrics. More affordable consumer wearable options, including the Fitbit and Apple Watch, have enabled longer term follow-up of health behaviors [138]. More recently, increasingly complex algorithms have been developed to derive meaningful signals from wearable data, such as the type of physical activity or sleep stages. These behavioral data can then be linked to environmental exposure data derived via location data collected from study participants (e.g. residential addresses or smartphone GPS data) to examine associations between their spatial exposures and objective health behaviors (Fig. 2).

Fig. 2. Traditional vs “Big Data” Cohort Design

Alongside research-grade accelerometry and consumer wearables, widespread use of smartphones has provided another opportunity for intensive data collection schemes, which an increasing number of studies have started using [139–141]. Compared to studies that rely on mailing wearables, smartphone-based collection of behavioral data is less burdensome and costly, allowing for longer follow-up periods due to secure uploading of near–real-time passive GPS and accelerometer data [142]. High accuracy smartphone GPS data can be merged with spatial datasets containing information on built and natural surroundings, noise levels, and air pollution to produce customized exposure metrics for environmental factors that change throughout the day over weeks and even months [143–146]. The smartphone accelerometer can take precise measurements through efficient sampling schemes that can provide an objective indicator of physical activity by identifying steps and cadence using advanced walk detection algorithms [147]. Smartphone apps can also be programmed to administer surveys on health behaviors or psychological health outcomes [148]. More importantly, these datasets can be linked at different temporal scales and sequences to examine the relationships between environmental and behavioral factors and chronic disease outcomes.
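
The merging of GPS traces with gridded spatial data described above can be sketched as a time-weighted average: each GPS fix is matched to the grid cell it falls in, and the cell's value is weighted by the time spent there until the next fix. The example below is a simplified illustration, not a method from the cited studies; the grid layout (a dictionary keyed by row and column), the grid origin, the cell size, and all values are hypothetical.

```python
import math


def cell_index(lat, lon, origin_lat, origin_lon, cell_deg):
    """Map a coordinate to the (row, column) of a regular lat/lon grid."""
    row = math.floor((lat - origin_lat) / cell_deg)
    col = math.floor((lon - origin_lon) / cell_deg)
    return row, col


def time_weighted_exposure(gps_fixes, grid, origin_lat, origin_lon, cell_deg):
    """Average gridded exposure along a GPS trace, weighted by elapsed time.

    `gps_fixes` is a time-sorted list of (timestamp_seconds, lat, lon).
    Each fix is weighted by the time until the next fix; the final fix only
    marks the end of the trace. Fixes outside the grid are skipped.
    Returns None if no time was spent inside the grid.
    """
    weighted_sum = 0.0
    total_time = 0.0
    for (t0, lat, lon), (t1, _, _) in zip(gps_fixes, gps_fixes[1:]):
        value = grid.get(cell_index(lat, lon, origin_lat, origin_lon, cell_deg))
        if value is None:
            continue
        duration = t1 - t0
        weighted_sum += value * duration
        total_time += duration
    return weighted_sum / total_time if total_time > 0 else None


# Hypothetical exposure surface with two cells (e.g., pollutant concentration)
grid = {(0, 0): 8.0, (0, 1): 20.0}

# Hypothetical trace: 10 minutes in the first cell, then 20 minutes in the second
trace = [
    (0, 42.05, -71.15),
    (600, 42.05, -71.05),
    (1800, 42.05, -71.05),
]
exposure = time_weighted_exposure(trace, grid, origin_lat=42.0, origin_lon=-71.2, cell_deg=0.1)
```

A residence-only assessment would assign a single cell value to the participant, whereas this activity-space average reflects where time was actually spent.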

Analyzing Social Media to Assess Responses to Environmental Hazards

The integration of online data sources, including social media and other digital platforms, can provide real-time insights into environmental and health-related exposures [65, 149]. Social media platforms offer rich geospatial data that can be leveraged for exposure assessment. X (formerly Twitter) enables researchers to analyze geotagged posts and monitor real-time environmental hazards. Hswen et al. (2019) demonstrated that geotagged tweets could be used to monitor air pollution exposure by correlating sentiment analysis (use of AI to attribute emotion and public opinion to social media discussions) with air quality index data [150]. Social media often provide real-time location information when an acute environmental disaster occurs, allowing government officials to share information with affected citizens, and communities to share information with one another about the unfolding disaster and to let responders know where resources are needed [151]. Social media can also be studied following a disaster to understand how the public is reacting, and which individuals and institutions are most influential in sharing information [152]. Applying sentiment analysis using GeoAI approaches can reveal useful information for responders; for example, a study following Hurricane Matthew found that damage-related tweets, rather than disaster-related tweets, were more important predictors of need [153]. These examples reveal how social media Big Data sources can augment traditional environmental exposure studies by allowing for joint examination of spatial and social context, improving predictions of public understanding of where to find help and resources during a disaster [154].

Beyond X, Google search queries provide real-time assessments of health trends. Sadilek et al. (2020) demonstrated that Google search trends could predict Lyme disease outbreaks, achieving a 92% correlation with CDC case data at the county level [155], revealing potential for GeoAI to support estimates of health and behavioral data for environmental epidemiologic research. Instagram and TikTok contribute to geospatial exposure assessment through location-tagged images and videos, allowing for evaluations of urban greenery, pollution sources, and built environments [156]. Facebook and Reddit facilitate community-driven discussions on environmental hazards, offering qualitative insights that complement quantitative monitoring systems [157].

Natural language processing and AI algorithms can be applied to social media and other text databases to serve as new sources of environmental exposure data [158]. A group of researchers in Malta were able to use natural language processing to extract features from 100,989 building permits into a geocoded database with building information to assess potential hazards, including year of construction, height and occupancy [159].

Applications of GeoAI for Data Analysis and Interpretation

Handling Spatially Correlated Data

Geospatial data often exhibit spatial autocorrelation whereby observations that are closer to each other are more similar than those that are further away [160]. This spatial dependence has implications for GeoAI algorithms when applied for exposure assessment and for causal inference in environmental epidemiologic studies [161].
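Spatial autocorrelation is commonly summarized with global Moran's I. The sketch below is a plain Python illustration rather than a method taken from the cited references. It assumes the analyst supplies a square spatial weights matrix (here a hypothetical binary adjacency matrix for four locations along a line). Values near +1 suggest that neighbors are similar, while values near -1 suggest that neighbors are dissimilar.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given an n x n spatial weights matrix.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_term = sum(d * d for d in dev)
    return (n / w_sum) * (cross / variance_term)


# Four locations on a line; each is a neighbor of the adjacent ones only.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth_gradient = morans_i([1, 2, 3, 4], adjacency)   # positive autocorrelation
checkerboard = morans_i([1, 4, 1, 4], adjacency)      # negative autocorrelation
```

A formal test would compare the observed statistic against its distribution under random permutation of the values, which this sketch does not do.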

When applying GeoAI algorithms for exposure assessment, spatial dependence of input predictors can be encoded into the algorithms themselves, referred to as “spatial embedding or spatial representation learning” [162]. This approach goes beyond the use of standard AI approaches discussed previously (e.g. greenspace in street view discussed in Section 2.4) by incorporating expert knowledge regarding spatial relationships between features that are being used to predict the exposure [163, 164]. For example, recurrent neural networks and transformer-based AI models to predict traffic patterns, an important contributor to air pollution, have been extended to incorporate spatial dependency using latitude and longitude to more precisely estimate changing traffic patterns [165, 166]. Spatial graph neural networks are alternative data models that do not rely on grid-based search, and can be augmented to incorporate expert knowledge regarding geographic and spatial relationships as nodes and spatial constraints in estimating weights when constructing the input graph [163].

Spatial dependence poses challenges to potential outcomes-based causal inference. Environmental epidemiologic studies relying on spatially-derived exposures are susceptible to unmeasured spatial confounding [167]. GeoAI could inform expert selection of confounder sets, particularly when applying two-step methods involving prediction of environmental exposure conditional on covariates (e.g. inverse probability of treatment weighting (IPTW) or propensity scores) [168]. These machine learning and AI-based approaches for estimating propensity scores and IPTW could incorporate geospatial environmental and contextual factors, along with location-based parameters [163, 169, 170].
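To make the two-step logic concrete, the sketch below estimates propensity scores for a binary exposure and converts them to stabilized inverse probability of treatment weights. It is a simplified illustration under several assumptions: the exposure is dichotomized, the covariates (which could include scaled coordinates or area-level factors) are already standardized, and a hand-written logistic regression stands in for the machine learning or GeoAI learner that a real analysis might use. All function names are hypothetical.

```python
import math


def fit_logistic(X, a, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient ascent.

    X: list of covariate rows (without intercept); a: list of 0/1 exposures.
    Returns coefficients with the intercept first.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, exposed in zip(X, a):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            resid = exposed - 1.0 / (1.0 + math.exp(-z))
            grad[0] += resid
            for k, x in enumerate(row):
                grad[k + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity(beta, row):
    """Predicted probability of exposure for one covariate row."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
    return 1.0 / (1.0 + math.exp(-z))


def stabilized_iptw(X, a):
    """Stabilized weights: P(A=a) / P(A=a | covariates)."""
    beta = fit_logistic(X, a)
    p_exposed = sum(a) / len(a)
    weights = []
    for row, exposed in zip(X, a):
        ps = propensity(beta, row)
        if exposed == 1:
            weights.append(p_exposed / ps)
        else:
            weights.append((1.0 - p_exposed) / (1.0 - ps))
    return weights


# Toy data: one binary covariate (e.g., an urban indicator) that is
# associated with exposure, so exposed participants in the stratum
# where exposure is rarer receive larger weights.
covariates = [[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]]
exposure = [0, 0, 1, 0, 1, 1]
weights = stabilized_iptw(covariates, exposure)
```

The weights would then be used in a weighted outcome model. Checking covariate balance and weight distributions after weighting remains necessary regardless of how the propensity scores were estimated.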

Analyzing Complex Exposure Mixtures, Exposomics

The exposome is defined as the totality of environmental exposures across the life course [5, 171, 172]. Exposomics aims to comprehensively investigate the cumulative effects of physical, chemical, biological, and psychosocial influences that impact biological systems through leveraging data from interdisciplinary methodologies to enable discovery-based analysis of environmental determinants of health [173]. This aim is ambitious, and GeoAI approaches can assist in reaching it by facilitating the analysis of high-dimensional, correlated datasets.

Mixture modeling provides methods to study the independent and aggregate effects of environmental exposures [174–176]. For example, Bayesian kernel machine regression (BKMR) models the health outcome as a smooth kernel function of the exposure variables adjusting for confounders, and accounts for collinearity using hierarchical variable selection [177]. There are also other mixture modeling methods including Bayesian Weighted Quantile Sum (WQS) regression and quantile g-computation [176, 178–181]. In addition, machine learning methods can be combined with mixture modeling to determine interactions between certain mixtures and/or mixture components. For example, boosted regression trees can estimate H-statistics to rank the importance of exposure pairs [182, 183]. Signed iterative random forest in conjunction with WQS regression has been used to search for synergistic effects of chemical stressors on autism spectrum disorder [183, 184]. Machine learning approaches based on penalization and shrinkage have been proposed to allow WQS to evaluate effects in opposing directions of environmental mixtures [185].
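The core quantity in WQS regression is a single index built from quantile-scored exposures. The sketch below shows only how such an index could be assembled once weights are available. In an actual WQS analysis the weights are estimated from the data (typically under non-negativity and sum-to-one constraints), whereas here they are supplied by hand as an assumption. The function names and toy concentrations are hypothetical.

```python
def quantile_scores(values, n_quantiles=4):
    """Map each value to a score 0..n_quantiles-1 based on its rank.

    Ties are broken by input order, which is a simplification.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = rank * n_quantiles // len(values)
    return scores


def wqs_index(exposures, weights, n_quantiles=4):
    """Weighted quantile sum index for each participant.

    exposures: dict mapping exposure name -> list of concentrations.
    weights:   dict mapping exposure name -> non-negative weight (sum to 1).
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    scored = {name: quantile_scores(vals, n_quantiles)
              for name, vals in exposures.items()}
    n = len(next(iter(scored.values())))
    return [sum(weights[name] * scored[name][i] for name in scored)
            for i in range(n)]


# Toy mixture of two pollutants for four participants (made-up values).
mixture = {"pm25": [5, 10, 15, 20], "no2": [40, 30, 20, 10]}
index = wqs_index(mixture, {"pm25": 0.75, "no2": 0.25})
```

The resulting index would enter an outcome regression as a single term. The loss of information from collapsing concentrations into quantile scores is visible directly in `quantile_scores`.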

Challenges for exposomics studies, which can examine a large number of environmental exposures, include the lack of standardized approaches for selecting geospatial exposure models for a given environmental factor [175]. Use of AI-based approaches, which incorporate spatial predictors, can improve interpretability of findings and help screen large numbers of candidate exposome variables [186]. Mixture modeling methods are each associated with limitations that should be considered such as sensitivity of the posterior inclusion probabilities to tuning parameters in BKMR and loss of information due to data transformation to quantiles in WQS [187].

Interpreting Model Parameters and Feature Importance

Most epidemiologic investigations seek to make inferences regarding what factors are most important or may cause the outcome of interest. A limitation of AI approaches is that the parameters used to predict exposures or outcomes are less interpretable than standard regression-based approaches [161]. In order to understand which predictors are most important, Shapley Additive exPlanations (SHAP) are commonly applied to examine the magnitude of each predictor's influence (geospatial or otherwise) on the overall exposure prediction. In most papers we reviewed for air and water quality exposure assessment using AI methods, the authors provided SHAP beeswarm plots and reported the rankings of predictors [45, 46, 54, 57, 58, 67, 85, 149, 188].
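The idea behind SHAP can be illustrated by computing exact Shapley values for a small model. The sketch below is not the SHAP software package, which relies on faster approximations and averages over a background dataset. Instead it enumerates every coalition of predictors and replaces "absent" predictors with a single baseline value, which is an assumption made for simplicity. The toy exposure model and its coefficients are invented.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    predict:  function taking a list of feature values.
    x:        feature values for the prediction being explained.
    baseline: reference values used for features outside a coalition.
    """
    n = len(x)

    def value(subset):
        # Features in `subset` keep observed values; others use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                coalition = set(combo)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


# Hypothetical linear exposure model with three predictors, e.g.
# traffic density, tree cover, and elevation (coefficients are made up).
def toy_model(features):
    traffic, trees, elevation = features
    return 2.0 + 3.0 * traffic - 1.0 * trees + 0.5 * elevation


contributions = shapley_values(toy_model, [1.0, 2.0, 4.0], [0.0, 0.0, 0.0])
```

For a linear model each contribution equals the coefficient times the departure from baseline, and the contributions sum to the difference between the prediction and the baseline prediction. Enumeration scales exponentially with the number of predictors, which is why approximate algorithms are used in practice.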

Challenges and Limitations when Applying GeoAI for Environmental Epidemiology

Spatial and Temporal Issues

GeoAI tools may excel at predicting environmental phenomena relative to traditional methods, but cannot produce a conceptual model for an epidemiologic relationship with a given environmental exposure. Epidemiologists and exposure scientists must provide guidance regarding the physical and pathological processes that link exposure to outcomes. This requires scientists to continue considering the core principles of time, place, and population when designing epidemiologic studies.

Because environmental exposures and related spatial phenomena are so context-dependent, GeoAI algorithms are likely to perform poorly when applied in different geographic settings and different time periods, as argued by Goodchild and Li [161]. Exposure models derived from AI models exhibit variable quality in predictions over larger geographic areas, which may lead to poorer performance in less populous areas and demographic groups [55–57, 84]. Some AI models such as recurrent neural networks struggle to model long-term trajectories because of the backpropagation process used in estimating weights [189], though alternative architectures, such as Long Short-Term Memory networks, can address some of these issues [190]. For a given research question, the input exposure data frequency and quality may determine the time period over which GeoAI algorithms can be used to predict exposure histories.

With respect to geographic context, statistical relationships between spatially defined exposures and outcomes are sensitive to the choice of spatial scale, referred to as the Modifiable Areal Unit Problem [191]. Investigators should specify clear conceptual models linking exposures and outcomes to determine the relevant spatial scale. Harmonizing exposure and outcome data collected at different spatial scales may involve areal interpolation that can introduce noise leading to loss of precision [192]. In addition, edge effects at boundaries of geographic areas can cause unstable predictions or introduce bias in epidemiologic studies [193]. In these settings, extending the boundaries using buffers can be used as a sensitivity test.
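Area-weighted interpolation is one simple form of areal interpolation. The sketch below assumes axis-aligned rectangular zones and an intensive variable (such as a mean concentration) so that overlap areas can be computed without a GIS library. Real administrative units are irregular polygons and would require proper geometry operations. The function names are hypothetical.

```python
def overlap_area(a, b):
    """Intersection area of two rectangles given as (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def area_weighted_mean(target, sources):
    """Estimate a value for `target` from overlapping source zones.

    sources: list of (rectangle, value) pairs. Each source value is
    weighted by the area it shares with the target zone.
    """
    total_area = 0.0
    weighted_sum = 0.0
    for rect, value in sources:
        area = overlap_area(target, rect)
        weighted_sum += area * value
        total_area += area
    return weighted_sum / total_area if total_area > 0 else None


# A target zone that spans one full source zone and part of another.
source_zones = [((0, 0, 1, 1), 10.0), ((1, 0, 4, 1), 20.0)]
estimate = area_weighted_mean((0, 0, 4, 1), source_zones)
```

The method assumes the exposure is uniform within each source zone. That assumption is a source of the noise described above and is also why results can shift when zone boundaries change.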

Ethical Issues

Verifying input data quality for GeoAI is essential because, without human interpretation, decisions made by AI models may yield unexpected or unwanted outcomes. In a recent review of 91 machine learning algorithms for clinical care, 87% reported bias for a socioeconomically disadvantaged group [194]. Application of an AI-based model within a hospital system exacerbated health disparities by preferentially excluding racialized minority patients and those from low-income neighborhoods from certain health services due to predicted increases in hospital stays [195]. These models may also have less accurate predictions in racialized minority groups [196]. In environmental epidemiologic studies, careful sampling strategies that ensure sufficient numbers of individuals across demographic and socioeconomic strata could enhance external validity of effect estimates and ensure that exposure measures are equally accurate across populations [197].

An assessment of the “bias and fairness” of the algorithm should be included when reporting results. These measures go beyond standard assessments of statistical accuracy such as calibration (how accurate predictions are with respect to observed exposures) and discrimination (how well the classifier algorithm separates exposed vs unexposed). Specifically, evaluations of fairness for AI algorithms reveal for whom algorithms provide more accurate vs less accurate predictions with respect to sociodemographic characteristics. Evaluating accuracy, positive and negative predictive values, and true and false positive rates across different groups can ensure that exposure assessment models are not inadvertently biased [55, 195, 198, 199].
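A group-stratified evaluation of this kind could be sketched as follows. The example assumes a binary exposure classification (exposed = 1, unexposed = 0), a reference standard, and a single grouping variable. The function name `group_metrics` and the toy data are hypothetical.

```python
from collections import defaultdict


def group_metrics(y_true, y_pred, groups):
    """Accuracy, TPR, FPR, PPV, and NPV computed separately for each group."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        # "t"/"f" records whether the prediction was correct,
        # "p"/"n" records whether the prediction was positive.
        key = ("t" if truth == pred else "f") + ("p" if pred == 1 else "n")
        counts[group][key] += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    results = {}
    for group, c in counts.items():
        results[group] = {
            "accuracy": ratio(c["tp"] + c["tn"], sum(c.values())),
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),
            "ppv": ratio(c["tp"], c["tp"] + c["fp"]),
            "npv": ratio(c["tn"], c["tn"] + c["fn"]),
        }
    return results


# Toy data: the classifier is perfect in group A but not in group B.
truth = [1, 1, 0, 0, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]
by_group = group_metrics(truth, predicted, group)
```

Large gaps between groups in any of these metrics would indicate that the exposure model misclassifies some populations more than others, even when overall accuracy looks acceptable.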

Another ethical consideration of GeoAI-based exposure assessment relates to privacy and confidentiality. Federal regulations strongly restrict access to residential addresses for research because they may identify individual patients. However, displacement of geocodes by introducing random “noise” (geomasking) has been shown to reduce disclosure risks [200, 201]. Efforts have been made to avoid sharing sensitive data while allowing geocoding and spatial linkages to be done where protected health information (PHI) is stored and used for research. Examples include the Decentralized Geomarker Assessment for Multi-Site Studies (DeGAUSS) method by Brokamp et al. [202]. AI models may eventually be able to create synthetic populations that preserve statistical relationships from the observed data, which could be analyzed without disclosing patient information [203].
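One widely described form of geomasking is "donut" masking, in which each geocode is displaced in a random direction by a distance between a minimum and a maximum. The sketch below illustrates that variant as an assumption, since the cited studies may use other schemes. Coordinates are assumed to be projected (metres), and the function name and distance limits are hypothetical.

```python
import math
import random


def donut_geomask(x, y, min_dist, max_dist, rng=random):
    """Displace a point by a random distance in [min_dist, max_dist].

    The radius is drawn so that masked points are uniform over the area
    of the ring rather than concentrated near its inner edge.
    """
    if not 0 <= min_dist < max_dist:
        raise ValueError("require 0 <= min_dist < max_dist")
    angle = rng.uniform(0.0, 2.0 * math.pi)
    radius = math.sqrt(rng.uniform(min_dist ** 2, max_dist ** 2))
    return x + radius * math.cos(angle), y + radius * math.sin(angle)


# Mask a made-up residential location by 100 to 500 metres.
generator = random.Random(2025)  # seeded only to make the example repeatable
masked_x, masked_y = donut_geomask(500000.0, 4650000.0, 100.0, 500.0, generator)
```

The minimum distance keeps the masked point from landing on the true address, and the maximum distance bounds the exposure misclassification that masking introduces. Choosing both typically depends on local population density.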

Measurement Error

Measurement error is a common source of bias in geospatial health studies, which occurs due to inaccuracies in geocoding [204], violations of the assumption that residential addresses reflect personal exposure [205], and limitations in spatial resolution [141]. Even when methods like mobile GPS approaches are used for exposure assessment, the internal biologically active dose of exposure must be assumed rather than directly measured. GeoAI methods could be applied to evaluate measurement error in environmental epidemiology studies [55]. For example, methods involving machine learning have been developed to estimate the optimal (most strongly predictive) geographic buffer size for capturing the most relevant greenspace exposure on depression [206, 207], and spatial data could improve predictions. In addition, conformal prediction, a machine learning approach to quantify uncertainty of predictions, has been extended for spatial analysis allowing for more nuanced interpretation of AI-based predictions in spatial environmental contexts [208] by providing information about how precise or imprecise the GeoAI-derived exposure assessment may be, and therefore how much measurement error may be introduced when using the GeoAI-derived exposure.
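The basic (non-spatial) split conformal procedure can be sketched in a few lines. The spatial extension cited above is not reproduced here. This example assumes a held-out calibration set of monitored values and model predictions, uses absolute residuals as the nonconformity score, and returns a symmetric interval. The function names and numbers are hypothetical.

```python
import math


def conformal_halfwidth(calib_true, calib_pred, alpha=0.1):
    """Half-width of a split conformal interval with target coverage 1 - alpha.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual
    from the calibration set. Returns infinity when the calibration set
    is too small for the requested coverage.
    """
    scores = sorted(abs(t - p) for t, p in zip(calib_true, calib_pred))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return scores[k - 1]


def conformal_interval(prediction, halfwidth):
    """Symmetric prediction interval around a new point prediction."""
    return prediction - halfwidth, prediction + halfwidth


# Made-up calibration data: monitored vs. predicted concentrations.
observed = [10.0, 12.0, 9.0, 15.0]
predicted = [11.0, 10.0, 9.5, 11.0]
halfwidth = conformal_halfwidth(observed, predicted, alpha=0.5)
lower, upper = conformal_interval(20.0, halfwidth)
```

Wider intervals flag locations or periods where the exposure prediction is less certain. The coverage guarantee relies on calibration and new data being exchangeable, an assumption that spatial dependence can violate and that spatially aware variants aim to address.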

Conclusion

In summary, GeoAI offers exciting opportunities to improve exposure assessment and analysis for environmental epidemiology research. When combined with existing administrative, street view, or satellite data, GeoAI allows efficient exposure assessment and prediction for settings that have previously been difficult or costly to capture, such as perceived built environment and water contamination. GeoAI can also be integrated with passively collected data from wearable devices and high resolution GPS tracking to characterize exposures occurring beyond a participant’s residence, reducing measurement error. Lastly, GeoAI may improve performance of novel mixtures-based analysis approaches, and augment methods to correct for study biases. Alongside the promise offered by GeoAI, researchers should consider potential issues arising from poor quality training data, the timing and spatial coverage of training data and its relevance to the population of interest, and the need to maintain privacy and confidentiality when using participant address information.

Key References

  • Mai G, Xie Y, Jia X, Lao N, Rao J, Zhu Q, et al. Towards the next generation of Geospatial Artificial Intelligence. International Journal of Applied Earth Observation and Geoinformation. 2025 Feb 1;136:104368.
    ⚬ This review provides perspectives from the fields of remote sensing and geography of potential for GeoAI to improve classification and prediction of environmental features derived from satellite images.
  • Di Q, Kloog I, Koutrakis P, Lyapustin A, Wang Y, Schwartz J. Assessing PM2.5 Exposures with High Spatiotemporal Resolution across the Continental United States. Environ Sci Technol. 2016 May 3;50(9):4712–21.
    ⚬ This paper provides a rigorous example of leveraging GeoAI and related machine learning approaches to improve prediction of fine particulate matter pollution.
  • Montanari A, Fancello G, Sueur C, Kestens Y, van Lenthe FJ, Chaix B. A sensor-based study on the environmental determinants of sleep in older adults. Environmental Research. 2025 Jun 1;274:120874.
    ⚬ This paper leverages wearable devices and GPS to study associations of multiple environmental exposures with sleep, and found that neighborhood socioeconomic status was the strongest predictor of sleep quality.
  • Yi L, Hart JE, Roscoe C, Mehta UV, Pescador Jimenez M, Lin PID, et al. Greenspace and depression incidence in the US-based nationwide Nurses’ Health Study II: A deep learning analysis of street-view imagery. Environment International. 2025 Apr 1;198:109429.
    ⚬ This study applied a deep learning model to classify street view images within a large prospective cohort study and found that visible trees were associated with lower depression.

Author Contributions

Conceptualization, funding: HSI, PJ, TVP; Data curation: all authors, writing – first draft: HSI, writing – review and editing: all authors.

Funding

HSI was supported by NIEHS K01ES035734 and NIEHS P30 ES005022. SK was supported by T32 CA094880. LY was supported by UG3OD035533. YH was supported by NIA P01AG082653. PJ was supported by R01 HL150119. TV was supported by K01 DK125612.

Data Availability

No datasets were generated or analysed during the current study.

Declarations

Human and Animal Rights and Informed Consent

This study does not contain any new data from human subjects or animals performed by any of the authors.

Competing interests

The authors declare no competing interests.

References

  • 1. Li S, Dragicevic S, Castro FA, Sester M, Winter S, Coltekin A, et al. Geospatial big data handling theory and methods: A review and research challenges. ISPRS J Photogramm Remote Sens. 2016;1(115):119–33.
  • 2. Elliott P, Wartenberg D. Spatial epidemiology: Current approaches and future challenges. Environ Health Perspect. 2004;112(9):998–1006.
  • 3. Sahar L, Foster SL, Sherman RL, Henry KA, Goldberg DW, Stinchcomb DG, et al. GIScience and cancer: state of the art and trends for cancer surveillance and epidemiology. Cancer. 2019;125(15):2544–60.
  • 4. Kirby RS, Delmelle E, Eberth JM. Advances in spatial epidemiology and geographic information systems. Ann Epidemiol. 2017;27(1):1–9.
  • 5. VoPham T, White AJ, Jones RR. Geospatial science for the environmental epidemiology of cancer in the exposome era. Cancer Epidemiol Biomark Prev. 2024;33(4):451–60.
  • 6. Wartenberg D. Screening for lead exposure using a geographic information system. Environ Res. 1992;59(2):310–7.
  • 7. Clark LP, Zilber D, Schmitt C, Fargo DC, Reif DM, Motsinger-Reif AA, et al. A review of geospatial exposure models and approaches for health data integration. J Expo Sci Environ Epidemiol. 2024;6:1–18.
  • 8. Nuckols JR, Ward MH, Jarup L. Using geographic information systems for exposure assessment in environmental epidemiology studies. Environ Health Perspect. 2004;112(9):1007–15.
  • 9. Lynch SM, Sorice K, Tagai EK, Handorf EA. Use of empiric methods to inform prostate cancer health disparities: Comparison of neighborhood-wide association study “hits” in black and white men. Cancer. 2020;126(9):1949–57.
  • 10. Vieira VM, VoPham T, Bertrand KA, James P, DuPré N, Tamimi RM, et al. Contribution of socioeconomic and environmental factors to geographic disparities in breast cancer risk in the Nurses’ Health Study II. Environ Epidemiol. 2020;4(1):e080.
  • 11. Iyer HS, Flanigan J, Wolf NG, Schroeder LF, Horton S, Castro MC, et al. Geospatial evaluation of trade-offs between equity in physical access to healthcare and health systems efficiency. BMJ Glob Health. 2020;5(10):e003493.
  • 12. Ray N, Ebener S. AccessMod 3.0: computing geographic coverage and accessibility to health care services using anisotropic movement of patients. Int J Health Geogr. 2008;7(1):63.
  • 13. Mai G, Xie Y, Jia X, Lao N, Rao J, Zhu Q, et al. Towards the next generation of Geospatial Artificial Intelligence. Int J Appl Earth Obs Geoinf. 2025;1(136):104368.
  • 14. Janowicz K, Gao S, McKenzie G, Hu Y, Bhaduri B. GeoAI: spatially explicit artificial intelligence techniques for geographic knowledge discovery and beyond. Int J Geogr Inf Sci. 2020;34(4):625–36.
  • 15. Abdel Magid HS, Desjardins MR, Hu Y. Opportunities and shortcomings of AI for spatial epidemiology and health disparities research on aging and the life course. Health Place. 2024;1(89):103323.
  • 16. Kamel Boulos MN, Peng G, VoPham T. An overview of GeoAI applications in health and healthcare. Int J Health Geogr. 2019;18(1):7.
  • 17. Züfle A, Anderson T, Kavak H, Pfoser D, Kim JS, Roess A. GeoAI for Public Health. In: Handbook of Geospatial Artificial Intelligence. CRC Press; 2023.
  • 18. Gao S, Hu Y, Li W, editors. Handbook of Geospatial Artificial Intelligence. Boca Raton: CRC Press; 2023. 468 p.
  • 19. Shilo S, Rossman H, Segal E. Axes of a revolution: challenges and promises of big data in healthcare. Nat Med. 2020;26(1):29–38.
  • 20. Khera R, Butte AJ, Berkwits M, Hswen Y, Flanagin A, Park H, et al. AI in medicine—JAMA’s focus on clinical outcomes, patient-centered care, quality, and equity. JAMA. 2023;330(9):818–20.
  • 21. Nogueira K, Penatti OAB, dos Santos JA. Towards better exploiting convolutional neural networks for remote sensing scene classification. Pattern Recogn. 2017;1(61):539–56.
  • 22. Lin TY, Dollár P, Girshick R, He K, Hariharan B, Belongie S. Feature Pyramid Networks for Object Detection [Internet]. arXiv; 2017 [cited 2025 May 5]. Available from: http://arxiv.org/abs/1612.03144
  • 23. Helber P, Bischke B, Dengel A, Borth D. EuroSAT: a novel dataset and deep learning benchmark for land use and land cover classification. IEEE J Sel Top Appl Earth Obs Remote Sens. 2019;12(7):2217–26.
  • 24. He J, Yuan Q, Li J, Xiao Y, Zhang L. A self-supervised remote sensing image fusion framework with dual-stage self-learning and spectral super-resolution injection. ISPRS J Photogramm Remote Sens. 2023;1(204):131–44.
  • 25. Lv Z, Liu J, Sun W, Lei T, Benediktsson JA, Jia X. Hierarchical attention feature fusion-based network for land cover change detection with homogeneous and heterogeneous remote sensing images. IEEE Trans Geosci Remote Sens. 2023;61:1–15.
  • 26. Zhang Y, Chen XJ, Gao S, Gong Y, Liu Y. Integrating smart card records and dockless bike-sharing data to understand the effect of the built environment on cycling as a feeder mode for metro trips. J Transp Geogr. 2024;1(121):103995.
  • 27. Wu Y, Zhang H, Li C, Tao S, Yang F. Urban ride-hailing demand prediction with multi-view information fusion deep learning framework. Appl Intell. 2023;53(8):8879–97.
  • 28. Yoo C, Weng Q. GeoAI for high-resolution urban air temperature estimation and urban heat island monitoring. In: Handbook of Geospatial Approaches to Sustainable Cities. CRC Press; 2024.
  • 29. Wang Z, Majumdar A, Rajagopal R. Geospatial mapping of distribution grid with machine learning and publicly-accessible multi-modal data. Nat Commun. 2023;14(1):5006.
  • 30. Irrgang C, Boers N, Sonnewald M, Barnes EA, Kadow C, Staneva J, et al. Towards neural Earth system modelling by integrating artificial intelligence in Earth system science. Nat Mach Intell. 2021;3(8):667–74.
  • 31. Wang Y, Ciais P, Broquet G, Bréon FM, Oda T, Lespinas F, et al. A global map of emission clumps for future monitoring of fossil fuel CO2 emissions from space. Earth System Science Data. 2019;11(2):687–703.
  • 32. Sharma E, Deo RC, Prasad R, Parisi AV. A hybrid air quality early-warning framework: An hourly forecasting model with online sequential extreme learning machines and empirical mode decomposition algorithms. Sci Total Environ. 2020;20(709):135934.
  • 33. Rezvani SMHS, Silva MJF, de Almeida NM. Mapping geospatial AI flood risk in national road networks. ISPRS Int J Geo Inf. 2024;13(9):323.
  • 34. Sorice KA, Fang CY, Wiese D, Ortiz A, Chen Y, Henry KA, et al. Systematic review of neighborhood socioeconomic indices studied across the cancer control continuum. Cancer Med. 2022;11(10):2125–44.
  • 35. Labib SM, Huck JJ, Lindley S. Modelling and mapping eye-level greenness visibility exposure using multi-source data at high spatial resolutions. Sci Total Environ. 2021;10(755):143050.
  • 36. Trinidad S, Brokamp C, Mor Huertas A, Beck AF, Riley CL, Rasnick E, et al. Use of area-based socioeconomic deprivation indices: A scoping review and qualitative analysis. Health Aff. 2022;41(12):1804–11.
  • 37. Biljecki F, Ito K. Street view imagery in urban analytics and GIS: A review. Landsc Urban Plan. 2021;1(215):104217.
  • 38. Boutayeb A, Lahsen-Cherif I, Khadimi AE. When machine learning meets geospatial data: a comprehensive GeoAI review. IEEE J Sel Top Appl Earth Obs Remote Sens. 2025;18:13135–91.
  • 39. VoPham T, Hart JE, Laden F, Chiang YY. Emerging trends in geospatial artificial intelligence (geoAI): potential applications for environmental epidemiology. Environ Health. 2018;17(1):40.
  • 40. Yang J, Zhao L, McBride J, Gong P. Can you see green? Assessing the visibility of urban forests in cities. Landsc Urban Plann. 2009;91(2):97–104.
  • 41. Janga B, Asamani GP, Sun Z, Cristea N. A review of practical AI for remote sensing in earth sciences. Remote Sensing. 2023;15(16):4112.
  • 42. Gebru T, Krause J, Wang Y, Chen D, Deng J, Aiden EL, et al. Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proc Natl Acad Sci. 2017;114(50):13108–13.
  • 43. Lim CC, Kim H, Vilcassim MJR, Thurston GD, Gordon T, Chen LC, et al. Mapping urban air quality using mobile sampling with low-cost sensors and machine learning in Seoul, South Korea. Environ Int. 2019;131:105022.
  • 44. Wang Y, Wu X, Chen Z, Ren F, Feng L, Du Q. Optimizing the predictive ability of machine learning methods for landslide susceptibility mapping using SMOTE for Lishui City in Zhejiang Province, China. Int J Environ Res Public Health. 2019;16(3).
  • 45. Hagedorn B, Pratt M, Sweeney C, Becker M, Bram D, Chou B, et al. Assessing risk of groundwater pollution exposure from sea level rise in California. Sci Total Environ. 2025;10(989):179695.
  • 46. Wu CD, Zhu JJ, Hsu CY, Shie RH. Quantifying source contributions to ambient NH(3) using Geo-AI with time lag and parcel tracking functions. Environ Int. 2024;185:108520.
  • 47. Cardille JA, Crowley MA, Saah D, Clinton NE, editors. Cloud-based remote sensing with google earth engine: Fundamentals and applications [Internet]. Cham: Springer International Publishing; 2024 [cited 2023 Nov 16]. Available from: https://link.springer.com/10.1007/978-3-031-26588-4
  • 48.Wang B, Yuan Q, Yang Q, Zhu L, Li T, Zhang L. Estimate hourly PM(2.5) concentrations from Himawari-8 TOA reflectance directly using geo-intelligent long short-term memory network. Environ Pollut. 2021 Feb 15;271:116327. [DOI] [PubMed]
  • 49.Quistberg DA, Mooney SJ, Tasdizen T, Arbelaez P, Nguyen QC. Invited commentary: deep learning—methods to amplify epidemiologic data collection and analyses. Am J Epidemiol. 2025;194(2):322–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Schootman M, Gomez SL, Henry KA, Paskett ED, Ellison GL, Oh A, et al. Geospatial approaches to cancer control and population sciences. Cancer Epidemiol Biomarkers Prev. 2017;26(4):472–5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Alex Quistberg D, Mooney SJ, Tasdizen T, Arbelaez P, Nguyen QC. Deep Learning - Methods to Amplify Epidemiological Data Collection and Analyses. Am J Epidemiol. 2024;kwae215. [DOI] [PMC free article] [PubMed]
  • 52.Serghiou S, Rough K. Deep learning for epidemiologists: An introduction to neural networks. Am J Epidemiol. 2023;192(11):1904–16. [DOI] [PubMed] [Google Scholar]
  • 53.Zhen Z, Lee H, Segovia-Dominguez I, Huang M, Chen Y, Garay M, et al. Environmental justice and lessons learned from COVID-19 outcomes—Uncovering hidden patterns with geometric deep learning and new NASA satellite data. 2024 Feb 29 [cited 2024 Nov 26]; Available from: https://journals.ametsoc.org/view/journals/aies/3/1/AIES-D-23-0040.1.xml
  • 54.Babaan J, Wong PY, Chen PC, Chen HL, Lung SCC, Chen YC, et al. Geospatial artificial intelligence for estimating daytime and nighttime nitrogen dioxide concentration variations in Taiwan: A spatial prediction model. J Environ Manage. 2024;360:121198. [DOI] [PubMed] [Google Scholar]
  • 55.Chambliss SE, Campmier MJ, Audirac M, Apte JS, Zigler CM. Local exposure misclassification in national models: relationships with urban infrastructure and demographics. J Expo Sci Environ Epidemiol. 2024;34(5):761–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Chen G, Li S, Knibbs LD, Hamm NAS, Cao W, Li T, et al. A machine learning method to estimate PM(2.5) concentrations across China with remote sensing, meteorological and land use information. Sci Total Environ. 2018;636:52–60. [DOI] [PubMed] [Google Scholar]
  • 57.Hsu CY, Lee RQ, Wong PY, Candice Lung SC, Chen YC, Chen PC, et al. Estimating morning and evening commute period O(3) concentration in Taiwan using a fine spatial-temporal resolution ensemble mixed spatial model with Geo-AI technology. J Environ Manage. 2024;351:119725. [DOI] [PubMed] [Google Scholar]
  • 58.Hsu CW, Chern YR, Su JJ, Wong PY, Asri AK, Wijaya C, et al. Assessing 3-D variability of ultrafine particle using a Geo-AI modelling approach: A case study in Zhunan-Miaoli Taiwan. Environ Pollut. 2025;23(383):126879. [DOI] [PubMed] [Google Scholar]
  • 59.Li Z. Forecasting weekly dengue cases by integrating Google Earth Engine-based risk predictor generation and Google Colab-based deep learning modeling in Fortaleza and the Federal District, Brazil. Int J Environ Res Public Health. 2022. 10.3390/ijerph192013555. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Liang L, Daniels J, Bailey C, Hu L, Phillips R, South J. Integrating low-cost sensor monitoring, satellite mapping, and geospatial artificial intelligence for intra-urban air pollution predictions. Environ Pollut. 2023;331(Pt 1):121832. [DOI] [PubMed] [Google Scholar]
  • 61.Liu R, Ma Z, Gasparrini A, de la Cruz A, Bi J, Chen K. Integrating augmented in situ measurements and a spatiotemporal machine learning model to back extrapolate historical particulate matter pollution over the United Kingdom: 1980–2019. Environ Sci Technol. 2023;57(51):21605–15. [DOI] [PubMed] [Google Scholar]
  • 62.Lu J, Bu P, Xia X, Lu N, Yao L, Jiang H. Feasibility of machine learning methods for predicting hospital emergency room visits for respiratory diseases. Environ Sci Pollut Res Int. 2021;28(23):29701–9. [DOI] [PubMed] [Google Scholar]
  • 63.Lu QO, Chang WH, Chu HJ, Lee CC. Enhancing indoor PM2.5 predictions based on land use and indoor environmental factors by applying machine learning and spatial modeling approaches. Environ Pollut. 2024;363(Pt 1):125093. [DOI] [PubMed] [Google Scholar]
  • 64.Ren X, Mi Z, Georgopoulos PG. Socioexposomics of COVID-19 across New Jersey: a comparison of geostatistical and machine learning approaches. J Expo Sci Environ Epidemiol. 2024;34(2):197–207. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Shen H, Zhou M, Li T, Zeng C. Integration of remote sensing and social sensing data in a deep learning framework for hourly urban PM2.5 mapping. Int J Environ Res Public Health. 2019. 10.3390/ijerph16214102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Tella A, Balogun AL. GIS-based air quality modelling: spatial prediction of PM10 for Selangor State, Malaysia using machine learning algorithms. Environ Sci Pollut Res Int. 2022;29(57):86109–25. [DOI] [PubMed] [Google Scholar]
  • 67.Wong PY, Su HJ, Lung SCC, Liu WY, Tseng HT, Adamkiewicz G, et al. Explainable geospatial-artificial intelligence models for the estimation of PM2.5 concentration variation during commuting rush hours in Taiwan. Environ Pollut. 2024;349:123974. [DOI] [PubMed]
  • 68.Xu Y, Ho HC, Wong MS, Deng C, Shi Y, Chan TC, et al. Evaluation of machine learning techniques with multiple remote sensing datasets in estimating monthly concentrations of ground-level PM2.5. Environ Pollut. 2018;242(Pt B):1417–26. [DOI] [PubMed]
  • 69.Kelly JT, Jang C, Timin B, Di Q, Schwartz J, Liu Y, et al. Examining PM2.5 concentrations and exposure using multiple models. Environ Res. 2021;196:110432. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Beelen R, Hoek G, Vienneau D, Eeftens M, Dimakopoulou K, Pedeli X, et al. Development of NO2 and NOx land use regression models for estimating air pollution exposure in 36 study areas in Europe – The ESCAPE project. Atmos Environ. 2013;72:10–23. [Google Scholar]
  • 71.Ma R, Ban J, Wang Q, Li T. Statistical spatial-temporal modeling of ambient ozone exposure for environmental epidemiology studies: A review. Sci Total Environ. 2020;701:134463. [DOI] [PubMed] [Google Scholar]
  • 72.International Agency for Research on Cancer (IARC). Outdoor Air Pollution. Lyon: IARC; 2013. (IARC Monographs on the Evaluation of Carcinogenic Risks to Humans). Report No.: 109.
  • 73.Di Q, Kloog I, Koutrakis P, Lyapustin A, Wang Y, Schwartz J. Assessing PM2.5 exposures with high spatiotemporal resolution across the continental United States. Environ Sci Technol. 2016;50(9):4712–21. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 74.Hu X, Belle JH, Meng X, Wildani A, Waller LA, Strickland MJ, et al. Estimating PM2.5 concentrations in the conterminous United States using the random forest approach. Environ Sci Technol. 2017;51(12):6936–44. [DOI] [PubMed] [Google Scholar]
  • 75.Di Q, Amini H, Shi L, Kloog I, Silvern R, Kelly J, et al. An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution. Environ Int. 2019;130:104909. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Shen S, Li C, van Donkelaar A, Jacobs N, Wang C, Martin RV. Enhancing global estimation of fine particulate matter concentrations by including geophysical a priori information in deep learning. ACS EST Air. 2024;1(5):332–45. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Di Q, Wang Y, Zanobetti A, Wang Y, Koutrakis P, Choirat C, et al. Air pollution and mortality in the Medicare population. N Engl J Med. 2017;376(26):2513–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Shi L, Steenland K, Li H, Liu P, Zhang Y, Lyles RH, et al. A national cohort study (2000–2018) of long-term air pollution exposure and incident dementia in older adults in the United States. Nat Commun. 2021;12(1):6754. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Camps-Valls G, Fernández-Torres MÁ, Cohrs KH, Höhl A, Castelletti A, Pacal A, et al. Artificial intelligence for modeling and understanding extreme weather and climate events. Nat Commun. 2025;16(1):1919. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Childs ML, Li J, Wen J, Heft-Neal S, Driscoll A, Wang S, et al. Daily local-level estimates of ambient wildfire smoke PM2.5 for the contiguous US. Environ Sci Technol. 2022;56(19):13607–21. [DOI] [PubMed] [Google Scholar]
  • 81.Li L, Girguis M, Lurmann F, Pavlovic N, McClure C, Franklin M, et al. Ensemble-based deep learning for estimating PM2.5 over California with multisource big data including wildfire smoke. Environ Int. 2020;145:106143. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Shadrin D, Illarionova S, Gubanov F, Evteeva K, Mironenko M, Levchunets I, et al. Wildfire spreading prediction using multimodal data and deep neural network approach. Sci Rep. 2024;14(1):2606. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 83.Ahn SH, Jeong DH, Kim M, Lee TK, Kim HK. Prediction of groundwater quality index to assess suitability for drinking purpose using averaged neural network and geospatial analysis. Ecotoxicol Environ Saf. 2023;265:115485. [DOI] [PubMed] [Google Scholar]
  • 84.Chen J, Zhao L, Wang B, He X, Duan L, Yu G. Uncovering global risk to human and ecosystem health from pesticides in agricultural surface water using a machine learning approach. Environ Int. 2024;194:109154. [DOI] [PubMed] [Google Scholar]
  • 85.Chen X, Zhao C, Chen J, Jiang H, Li D, Zhang J, et al. Water quality parameters-based prediction of dissolved oxygen in estuaries using advanced explainable ensemble machine learning. J Environ Manage. 2025;380:125146. [DOI] [PubMed] [Google Scholar]
  • 86.Durrani TS, Akhtar MM, Kakar KU, Khan MN, Muhammad F, Khan M, et al. Geochemical evolution, geostatistical mapping and machine learning predictive modeling of groundwater fluoride: a case study of western Balochistan, Quetta. Environ Geochem Health. 2024;47(2):32. [DOI] [PubMed] [Google Scholar]
  • 87.Hossain M, Wiegand B, Reza A, Chaudhuri H, Mukhopadhyay A, Yadav A, et al. A machine learning approach to investigate the impact of land use land cover (LULC) changes on groundwater quality, health risks and ecological risks through GIS and response surface methodology (RSM). J Environ Manage. 2024;366:121911. [DOI] [PubMed] [Google Scholar]
  • 88.Podgorski J, Araya D, Berg M. Geogenic manganese and iron in groundwater of Southeast Asia and Bangladesh - Machine learning spatial prediction modeling and comparison with arsenic. Sci Total Environ. 2022;833:155131. [DOI] [PubMed] [Google Scholar]
  • 89.Podgorski J, Wu R, Chakravorty B, Polya DA. Groundwater arsenic distribution in India by machine learning geospatial modeling. Int J Environ Res Public Health. 2020;17(19):7119. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Sarigai, Yang J, Zhou A, Han L, Li Y, Xie Y. Monitoring urban black-odorous water by using hyperspectral data and machine learning. Environ Pollut. 2021;269:116166. [DOI] [PubMed]
  • 91.Shin HM, Vieira VM, Ryan PB, Detwiler R, Sanders B, Steenland K, et al. Environmental fate and transport modeling for perfluorooctanoic acid emitted from the Washington Works facility in West Virginia. Environ Sci Technol. 2011;45(4):1435–42. [DOI] [PubMed] [Google Scholar]
  • 92.Ward MH, Cantor KP, Riley D, Merkle S, Lynch CF. Nitrate in public water supplies and risk of bladder cancer. Epidemiology. 2003;14(2):183. [DOI] [PubMed] [Google Scholar]
  • 93.Murray A, Hall A. Community water system service areas [Internet]. Cincinnati, OH: Center for Environmental Solutions and Emergency Response; 2024 May [cited 2025 Apr 16]. (Office of Research and Development). Available from: https://epa.maps.arcgis.com/home/item.html?id=609326a2d1024f2da42ac53f68f40db6
  • 94.Hu XC, Andrews DQ, Lindstrom AB, Bruton TA, Schaider LA, Grandjean P, et al. Detection of poly- and perfluoroalkyl substances (PFASs) in U.S. drinking water linked to industrial sites, military fire training areas, and wastewater treatment plants. Environ Sci Technol Lett. 2016;3(10):344–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Post GB, Louis JB, Lippincott RL, Procopio NA. Occurrence of perfluorinated compounds in raw water from New Jersey public drinking water systems. Environ Sci Technol. 2013;47(23):13266–75. [DOI] [PubMed] [Google Scholar]
  • 96.Steenland K, Winquist A. PFAS and cancer, a scoping review of the epidemiologic evidence. Environ Res. 2021;194(1):110690. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Tokranov AK, Ransom KM, Bexfield LM, Lindsey BD, Watson E, Dupuy DI, et al. Predictions of groundwater PFAS occurrence at drinking water supply depths in the United States. Science. 2024;386(6723):748–55. [DOI] [PubMed] [Google Scholar]
  • 98.Roberts EM, Madrigal D, Valle J, King G, Kite L. Assessing child lead poisoning case ascertainment in the US, 1999–2010. Pediatrics. 2017;139(5):e20164266. [DOI] [PubMed] [Google Scholar]
  • 99.Spanier AJ, Wilson S, Ho M, Hornung R, Lanphear BP. The contribution of housing renovation to children’s blood lead levels: a cohort study. Environ Health. 2013;12(1):72. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Huynh BQ, Chin ET, Kiang MV. Estimated childhood lead exposure from drinking water in Chicago. JAMA Pediatr. 2024;178(5):473–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 101.Hajiseyedjavadi S, Blackhurst M, Karimi HA. A Machine Learning Approach to Identify Houses with High Lead Tap Water Concentrations. In: Proceedings of the AAAI Conference on Artificial Intelligence [Internet]. 2020 [cited 2025 Apr 16]. p. 13300–5. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/7040
  • 102.Chen THK, Horsdal HT, Samuelsson K, Closter AM, Davies M, Barthel S, et al. Higher depression risks in medium- than in high-density urban form across Denmark. Sci Adv. 2023;9(21):eadf3760. [DOI] [PMC free article] [PubMed]
  • 103.Dahu BM, Martinez-Villar CI, Toubal IE, Alshehri M, Ouadou A, Khan S, et al. Application of machine learning and deep neural visual features for predicting adult obesity prevalence in Missouri. Int J Environ Res Public Health. 2024. 10.3390/ijerph21111534. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Lotfata A, Moosazadeh M, Helbich M, Hoseini B. Socioeconomic and environmental determinants of asthma prevalence: a cross-sectional study at the U.S. county level using geographically weighted random forests. Int J Health Geogr. 2023;22(1):18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 105.Pala D, Caldarone AA, Franzini M, Malovini A, Larizza C, Casella V, et al. Deep learning to unveil correlations between urban landscape and population health. Sensors. 2020. 10.3390/s20072105. [DOI] [PMC free article] [PubMed]
  • 106.Pavicic M, Walker AM, Sullivan KA, Lagergren J, Cliff A, Romero J, et al. Using iterative random forest to find geospatial environmental and sociodemographic predictors of suicide attempts. Front Psychiatry. 2023;14:1178633. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107.Rachele JN, Wang J, Wijnands JS, Zhao H, Bentley R, Stevenson M. Using machine learning to examine associations between the built environment and physical function: a feasibility study. Health Place. 2021;70:102601. [DOI] [PubMed] [Google Scholar]
  • 108.Talukder H, Muñoz-Zanzi C, Salgado M, Berg S, Yang A. Identifying the drivers related to animal reservoirs, environment, and socio-demography of human leptospirosis in different community types of Southern Chile: an application of machine learning algorithm in one health perspective. Pathogens. 2024. 10.3390/pathogens13080687. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109.James P, Banay RF, Hart JE, Laden F. A review of the health benefits of greenness. Curr Epidemiol Rep. 2015;2(2):131–42. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110.Fong KC, Hart JE, James P. A review of epidemiologic studies on greenness and health: updated literature through 2017. Curr Environ Health Rep. 2018;5(1):77–87. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 111.Jimenez MP, DeVille NV, Elliott EG, Schiff JE, Wilt GE, Hart JE, et al. Associations between nature exposure and health: A review of the evidence. Int J Environ Res Public Health. 2021;18(9):4790. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112.Iyer HS, Hart JE, James P, Elliott EG, DeVille NV, Holmes MD, et al. Impact of neighborhood socioeconomic status, income segregation, and greenness on blood biomarkers of inflammation. Environ Int. 2022;162:107164. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113.Jimenez MP, Elliott EG, DeVille NV, Laden F, Hart JE, Weuve J, et al. Residential green space and cognitive function in a large cohort of middle-aged women. JAMA Netw Open. 2022;5(4):e229306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114.Banay RF, James P, Hart JE, Kubzansky LD, Spiegelman D, Okereke OI, et al. Greenness and depression incidence among older women. Environ Health Perspect. 2019;127(2):027001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115.Klompmaker JO, Laden F, Browning MHEM, Dominici F, Ogletree SS, Rigolon A, et al. Associations of parks, greenness, and blue space with cardiovascular and respiratory disease hospitalization in the US Medicare cohort. Environ Pollut. 2022;312:120046. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 116.Klompmaker JO, Laden F, Browning MHEM, Dominici F, Jimenez MP, Ogletree SS, et al. Associations of greenness, parks, and blue space with neurodegenerative disease hospitalizations among older US adults. JAMA Netw Open. 2022;5(12):e2247664. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 117.Halder B, Ahmadianfar I, Heddam S, Mussa ZH, Goliatt L, Tan ML, et al. Machine learning-based country-level annual air pollutants exploration using Sentinel-5P and Google Earth Engine. Sci Rep. 2023;13(1):7968. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118.Safarianzengir V, Sobhani B, Yazdani MH, Kianian M. Monitoring, analysis and spatial and temporal zoning of air pollution (carbon monoxide) using Sentinel-5 satellite data for health management in Iran, located in the Middle East. Air Qual Atmos Health. 2020;13(6):709–19. [Google Scholar]
  • 119.Larkin A, Hystad P. Evaluating street view exposure measures of visible green space for health research. J Expo Sci Environ Epidemiol. 2019;29(4):447–56. [DOI] [PubMed] [Google Scholar]
  • 120.Luo J, Zhai S, Song G, He X, Song H, Chen J, et al. Assessing inequity in green space exposure toward a “15-Minute City” in Zhengzhou, China: Using deep learning and urban big data. Int J Environ Res Public Health. 2022. 10.3390/ijerph19105798. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121.Xu J, Liu Y, Liu Y, An R, Tong Z. Integrating street view images and deep learning to explore the association between human perceptions of the built environment and cardiovascular disease in older adults. Soc Sci Med. 2023;338:116304. [DOI] [PubMed] [Google Scholar]
  • 122.Yue X, Antonietti A, Alirezaei M, Tasdizen T, Li D, Nguyen L, et al. Using convolutional neural networks to derive neighborhood built environments from Google Street View images and examine their associations with health outcomes. Int J Environ Res Public Health. 2022;19(19):12095. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123.Zhang A, Zhai S, Liu X, Song G, Feng Y. Investigating the association between streetscapes and mental health in Zhanjiang, China: using Baidu street view images and deep learning. Int J Environ Res Public Health. 2022. 10.3390/ijerph192416634. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124.Zhao H, Shi J, Qi X, Wang X, Jia J. Pyramid scene parsing network. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) [Internet]. 2017 [cited 2025 Apr 16]. p. 6230–9. Available from: https://ieeexplore.ieee.org/document/8100143
  • 125.Yi L, Harnois-Leblanc S, Rifas-Shiman SL, Suel E, Pescador Jimenez M, Lin PID, et al. Satellite-based and street-view green space and adiposity in US children. JAMA Netw Open. 2024;7(12):e2449113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126.Yi L, Rifas-Shiman S, Pescador Jimenez M, Lin PID, Suel E, Hystad P, et al. Assessing greenspace and cardiovascular health through deep-learning analysis of street-view imagery in a cohort of US children. Environ Res. 2025;265:120459. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127.Yi L, Hart JE, Roscoe C, Mehta UV, Pescador Jimenez M, Lin PID, et al. Greenspace and depression incidence in the US-based nationwide Nurses’ Health Study II: A deep learning analysis of street-view imagery. Environ Int. 2025;198:109429. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128.Klompmaker JO, Mork D, Zanobetti A, Braun D, Hankey S, Hart JE, et al. Associations of street-view greenspace with Parkinson’s disease hospitalizations in an open cohort of elderly US Medicare beneficiaries. Environ Int. 2024;188:108739. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 129.Nguyen QC, Khanna S, Dwivedi P, Huang D, Huang Y, Tasdizen T, et al. Using Google Street View to examine associations between built environment characteristics and U.S. health outcomes. Prevent Med Reports. 2019;14:100859. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 130.Nguyen QC, Huang Y, Kumar A, Duan H, Keralis JM, Dwivedi P, et al. Using 164 million Google Street View images to derive built environment predictors of COVID-19 cases. Int J Environ Res Public Health. 2020;17(17):6359. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 131.Liu Y, Yi L, Xu Y, Cabison J, Eckel SP, Mason TB, et al. Spatial and temporal determinants of particulate matter peak exposures during pregnancy and early postpartum. Environ Adv. 2024;17:100557. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 132.Xie S, Meeker JR, Perez L, Eriksen W, Localio A, Park H, et al. Feasibility and acceptability of monitoring personal air pollution exposure with sensors for asthma self-management. Asthma Res Pract. 2021;7(1):13. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 133.Ryan I, Deng X, Thurston G, Khwaja H, Romeiko X, Zhang W, et al. Measuring students’ exposure to particulate matter (PM) pollution across microenvironments and seasons using personal air monitors. Environ Monit Assess. 2022;195(1):103. [DOI] [PubMed] [Google Scholar]
  • 134.Rappold AG, Hano MC, Prince S, Wei L, Huang SM, Baghdikian C, et al. Smoke Sense initiative leverages citizen science to address the growing wildfire-related public health problem. Geohealth. 2019;3(12):443–57. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 135.Helbig C, Ueberham M, Becker AM, Marquart H, Schlink U. Wearable sensors for human environmental exposure in urban settings. Curr Pollut Rep. 2021;7(3):417–33. [Google Scholar]
  • 136.Lin X, Luo J, Liao M, Su Y, Lv M, Li Q, et al. Wearable sensor-based monitoring of environmental exposures and the associated health effects: a review. Biosensors. 2022;12(12):1131. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 137.Montanari A, Fancello G, Sueur C, Kestens Y, van Lenthe FJ, Chaix B. A sensor-based study on the environmental determinants of sleep in older adults. Environ Res. 2025;274:120874. [DOI] [PubMed] [Google Scholar]
  • 138.Straczkiewicz M, Huang EJ, Onnela JP. A “one-size-fits-most” walking recognition method for smartphones, smartwatches, and wearable accelerometers. NPJ Digit Med. 2023;6(1):1–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 139.Yi L, Hart JE, Straczkiewicz M, Karas M, Wilt GE, Hu CR, et al. Measuring environmental and behavioral drivers of chronic diseases using smartphone-based digital phenotyping: Intensive longitudinal observational mHealth substudy embedded in 2 prospective cohorts of adults. JMIR Public Health Surveill. 2024;10(1):e55170. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 140.Chaix B. Mobile sensing in environmental health and neighborhood research. Annu Rev Public Health. 2018;39(1):367. [DOI] [PubMed] [Google Scholar]
  • 141.James P, Jankowska M, Marx C, Hart JE, Berrigan D, Kerr J, et al. “Spatial Energetics”: Integrating data from GPS, accelerometry, and GIS to address obesity and inactivity. Am J Prev Med. 2016;51(5):792–800. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 142.Onnela JP. Opportunities and challenges in the collection and analysis of digital phenotyping data. Neuropsychopharmacology. 2021;46(1):45–54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 143.Yi L, Xu Y, O’Connor S, Cabison J, Rosales M, Chu D, et al. GPS-derived environmental exposures during pregnancy and early postpartum - Evidence from the MADRES cohort. Sci Total Environ. 2024;918:170551. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 144.Hystad P, Amram O, Oje F, Larkin A, Boakye K, Avery A, et al. Bring your own location data: Use of Google smartphone location history data for environmental health research. Environ Health Perspect. 2022;130(11):117005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 145.Tao Y, Chai Y, Kou L, Kwan MP. Understanding noise exposure, noise annoyance, and psychological stress: Incorporating individual mobility and the temporality of the exposure-effect relationship. Appl Geogr. 2020;125:102283. [Google Scholar]
  • 146.Yi L, Wilson JP, Mason TB, Habre R, Wang S, Dunton GF. Methodologies for assessing contextual exposure to the built environment in physical activity studies: a systematic review. Health Place. 2019;60:102226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 147.Straczkiewicz M, James P, Onnela JP. A systematic review of smartphone-based human activity recognition methods for health research. NPJ Digit Med. 2021;4(1):148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 148.Fischer F, Kleen S. Possibilities, problems, and perspectives of data collection by mobile apps in longitudinal epidemiological studies: Scoping review. J Med Internet Res. 2021;23(1):e17691. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 149.Wang S, Liang C, Gao Y, Ye Y, Qiu J, Tao C, et al. Social media insights into spatio-temporal emotional responses to COVID-19 crisis. Health Place. 2024;85:103174. [DOI] [PubMed] [Google Scholar]
  • 150.Hswen Y, Qin Q, Brownstein JS, Hawkins JB. Feasibility of using social media to monitor outdoor air pollution in London, England. Prevent Med. 2019;121:86–93. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 151.Finch KC, Snook KR, Duke CH, Fu KW, Tse ZTH, Adhikari A, et al. Public health implications of social media use during natural disasters, environmental disasters, and other environmental concerns. Nat Hazards. 2016;83(1):729–60. [Google Scholar]
  • 152.Muniz-Rodriguez K, Ofori SK, Bayliss LC, Schwind JS, Diallo K, Liu M, et al. Social media use in emergency response to natural disasters: a systematic review with a public health perspective. Disaster Med Public Health Prep. 2020;14(1):139–49. [DOI] [PubMed] [Google Scholar]
  • 153.Yuan F, Liu R. Feasibility study of using crowdsourcing to identify critical affected areas for rapid damage assessment: Hurricane Matthew case study. Int J Disaster Risk Reduct. 2018;28:758–67. [Google Scholar]
  • 154.Kestens Y, Wasfi R, Naud A, Chaix B. “Contextualizing Context”: Reconciling environmental exposures, social networks, and location preferences in health research. Curr Environ Health Rep. 2017;4(1):51–60. [DOI] [PubMed] [Google Scholar]
  • 155.Sadilek A, Hswen Y, Bavadekar S, Shekel T, Brownstein JS, Gabrilovich E. Lymelight: forecasting Lyme disease risk using web search data. NPJ Digit Med. 2020;3:16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 156.Kim Y, Kim JH. Using photos for public health communication: a computational analysis of the Centers for Disease Control and Prevention Instagram photos and public responses. Health Informatics J. 2020;26(3):2159–80. [DOI] [PubMed] [Google Scholar]
  • 157.Eichstaedt JC, Smith RJ, Merchant RM, Ungar LH, Crutchley P, Preoţiuc-Pietro D, et al. Facebook language predicts depression in medical records. Proc Natl Acad Sci U S A. 2018;115(44):11203–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 158.Schoene AM, Basinas I, van Tongeren M, Ananiadou S. A narrative literature review of natural language processing applied to the occupational exposome. Int J Environ Res Public Health. 2022;19(14):8544. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 159.Schembri J, Gentile R. Augmenting natural hazard exposure modelling using natural language processing. Int J Disaster Risk Reduct. 2024;101:104202. [Google Scholar]
  • 160.Tobler WR. A computer movie simulating urban growth in the Detroit region. Econ Geogr. 1970;46:234–40. [Google Scholar]
  • 161.Goodchild MF, Li W. Replication across space and time must be weak in the social and environmental sciences. Proc Natl Acad Sci U S A. 2021;118(35):e2015759118. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 162.Mai G, Janowicz K, Hu Y, Gao S, Yan B, Zhu R, et al. A review of location encoding for GeoAI: methods and applications. Int J Geogr Inf Sci. 2022;36(4):639–73. [Google Scholar]
  • 163.Gao S, Rao J, Liang Y, Kang Y, Zhu J, Zhu R. GeoAI Methodological Foundations: Deep Neural Networks and Knowledge Graphs. In: Handbook of Geospatial Artificial Intelligence. CRC Press; 2023.
  • 164.Mai G, Li Z, Lao N. Spatial representation learning in GeoAI. In: Handbook of Geospatial Artificial Intelligence. CRC Press; 2023.
  • 165.Zhao L, Song Y, Zhang C, Liu Y, Wang P, Lin T, et al. T-GCN: a temporal graph convolutional network for traffic prediction. IEEE Trans Intell Transp Syst. 2020;21(9):3848–58. [Google Scholar]
  • 166.Liang Y, Zhu J, Ye W, Gao S. Region2Vec: community detection on spatial networks using graph embedding with node attributes and spatial interactions. In: Proceedings of the 30th International Conference on Advances in Geographic Information Systems [Internet]. New York, NY, USA: Association for Computing Machinery; 2022 [cited 2025 Jul 30]. p. 1–4. (SIGSPATIAL ’22). Available from: https://dl.acm.org/doi/10.1145/3557915.3560974
  • 167.Reich BJ, Yang S, Guan Y, Giffin AB, Miller MJ, Rappold A. A review of spatial causal inference methods for environmental and epidemiological applications. Int Stat Rev. 2021;89(3):605–34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 168.Lee J, Ma S, Serban N, Yang S. Accurate treatment effect estimation using inverse probability of treatment weighting with deep learning. JAMIA Open. 2025;8(2):ooaf032. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 169.Higbee JD, Lefler JS, Burnett RT, Ezzati M, Marshall JD, Kim SY, et al. Estimating long-term pollution exposure effects through inverse probability weighting methods with Cox proportional hazards models. Environ Epidemiol. 2020;4(2):e085. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 170.Giffin A, Reich BJ, Yang S, Rappold AG. Generalized propensity score approach to causal inference with spatial interference. Biometrics. 2023;79(3):2220–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 171.Wild CP. The exposome: from concept to utility. Int J Epidemiol. 2012;41(1):24–32. [DOI] [PubMed] [Google Scholar]
  • 172.Stingone JA, Buck Louis GM, Nakayama SF, Vermeulen RCH, Kwok RK, Cui Y, et al. Toward greater implementation of the exposome research paradigm within environmental epidemiology. Annu Rev Public Health. 2017;38:315–27. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 173.Miller GW. Exposomics: perfection not required. Exposome. 2024;4(1):osae006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 174.Patel CJ. Analytic complexity and challenges in identifying mixtures of exposures associated with phenotypes in the Exposome era. Curr Epidemiol Rep. 2017;4(1):22–30. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 175.Hu H, Liu X, Zheng Y, He X, Hart J, James P, et al. Methodological challenges in spatial and contextual exposome-health studies. Crit Rev Environ Sci Technol. 2023;53(7):827–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 176.Hamra GB, Buckley JP. Environmental exposure mixtures: questions and methods to address them. Curr Epidemiol Rep. 2018;5(2):160–5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 177.Bobb JF, Valeri L, Claus Henn B, Christiani DC, Wright RO, Mazumdar M, et al. Bayesian kernel machine regression for estimating the health effects of multi-pollutant mixtures. Biostatistics. 2015;16(3):493–508. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 178.Colicino E, Pedretti NF, Busgang SA, Gennings C. Per- and poly-fluoroalkyl substances and bone mineral density: results from the Bayesian weighted quantile sum regression. Environ Epidemiol. 2020;4(3):e092. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 179.Keil AP, Buckley JP, Kalkbrenner AE. Bayesian G-Computation for estimating impacts of interventions on exposure mixtures: Demonstration with metals from coal-fired power plants and birth weight. Am J Epidemiol. 2021;190(12):2647–57. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 180.Czarnota J, Gennings C, Wheeler DC. Assessment of weighted quantile sum regression for modeling chemical mixtures and cancer risk. Cancer Inform. 2015;14(Suppl 2):159–71. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 181.Maitre L, Guimbaud JB, Warembourg C, Güil-Oumrait N, Petrone PM, Chadeau-Hyam M, et al. State-of-the-art methods for exposure-health studies: Results from the exposome data challenge event. Environ Int. 2022;168:107422. [DOI] [PubMed] [Google Scholar]
  • 182.Bellavia A, Dickerson AS, Rotem RS, Hansen J, Gredal O, Weisskopf MG. Joint and interactive effects between health comorbidities and environmental exposures in predicting amyotrophic lateral sclerosis. Int J Hyg Environ Health. 2021;231:113655. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 183.Pan S, Li Z, Rubbo B, Quon-Chow V, Chen JC, Baumert BO, et al. Applications of mixture methods in epidemiological studies investigating the health impact of persistent organic pollutants exposures: a scoping review. J Expo Sci Environ Epidemiol. 2024 Sep 10. [DOI] [PMC free article] [PubMed]
  • 184.Midya V, Alcala CS, Rechtman E, Gregory JK, Kannan K, Hertz-Picciotto I, et al. Machine learning assisted discovery of interactions between pesticides, phthalates, phenols, and trace elements in child neurodevelopment. Environ Sci Technol. 2023;57(46):18139–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 185.Renzetti S, Gennings C, Calza S. A weighted quantile sum regression with penalized weights and two indices. Front Public Health [Internet]. 2023 Jul 18 [cited 2025 Apr 16];11. Available from: https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2023.1151821/full [DOI] [PMC free article] [PubMed]
  • 186.Isola S, Murdaca G, Brunetto S, Zumbo E, Tonacci A, Gangemi S. The use of artificial intelligence to analyze the exposome in the development of chronic diseases: A review of the current literature. Informatics. 2024;11(4):86.
  • 187.Yu L, Liu W, Wang X, Ye Z, Tan Q, Qiu W, et al. A review of practical statistical methods used in epidemiological studies to estimate the health effects of multi-pollutant mixture. Environ Pollut. 2022;1(306):119356.
  • 188.Benà E, Ciotoli G, Petermann E, Bossew P, Ruggiero L, Verdi L, et al. A new perspective in radon risk assessment: Mapping the geological hazard as a first step to define the collective radon risk exposure. Sci Total Environ. 2024;20(912):169569.
  • 189.Bengio Y, Simard P, Frasconi P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Networks. 1994;5(2):157–66.
  • 190.Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.
  • 191.Openshaw S. The Modifiable Areal Unit Problem. Geo Books; 1983. 40 p.
  • 192.Huo Z, Wen J, Li Z, Chen D, Xi M, Li Y, et al. Spatial interpolation of global DEM using federated deep learning. Sci Rep. 2024;14(1):22089.
  • 193.Stewart Fotheringham A, Rogerson PA. GIS and spatial analytical problems. Int J Geogr Inf Syst. 1993;7(1):3–19.
  • 194.Colacci M, Huang YQ, Postill G, Zhelnov P, Fennelly O, Verma A, et al. Sociodemographic bias in clinical machine learning models: a scoping review of algorithmic bias instances and mechanisms. J Clin Epidemiol. 2025;1(178):111606.
  • 195.Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447–53.
  • 196.Norori N, Hu Q, Aellen FM, Faraci FD, Tzovara A. Addressing bias in big data and AI for health care: A call for open science. Patterns (N Y). 2021;2(10):100347.
  • 197.Downs TJ, Ogneva-Himmelberger Y, Aupont O, Wang Y, Raj A, Zimmerman P, et al. Vulnerability-based spatial sampling stratification for the National Children’s Study, Worcester County, Massachusetts: capturing health-relevant environmental and sociodemographic variability. Environ Health Perspect. 2010;118(9):1318–25.
  • 198.Chen RJ, Wang JJ, Williamson DFK, Chen TY, Lipkova J, Lu MY, et al. Algorithmic fairness in artificial intelligence for medicine and healthcare. Nat Biomed Eng. 2023;7(6):719–42.
  • 199.Karasaki S, Morello-Frosch R, Callaway D. Machine learning for environmental justice: Dissecting an algorithmic approach to predict drinking water quality in California. Sci Total Environ. 2024;15(951):175730.
  • 200.Kwan MP, Casas I, Schmitz BC. Protection of geoprivacy and accuracy of spatial information: How effective are geographical masks? Cartographica. 2004;39(2):15–28.
  • 201.Iyer HS, Shi X, Satagopan JM, Cheng I, Roscoe C, McLaughlin RH, et al. Advancing social and environmental research in cancer registries using geomasking for address-level data. Cancer Epidemiol Biomark Prev. 2023;32(11):1485–9.
  • 202.Brokamp C, Wolfe C, Lingren T, Harley J, Ryan P. Decentralized and reproducible geocoding and characterization of community and environmental exposures for multisite studies. J Am Med Inform Assoc. 2018;25(3):309–14.
  • 203.Arora A, Wagner SK, Carpenter R, Jena R, Keane PA. The urgent need to accelerate synthetic data privacy frameworks for medical research. The Lancet Digital Health. 2025;7(2):e157–60.
  • 204.Krieger N, Waterman P, Lemieux K, Zierler S, Hogan JW. On the wrong side of the tracts? Evaluating the accuracy of geocoding in public health research. Am J Public Health. 2001;91(7):1114–6.
  • 205.Perchoux C, Chaix B, Cummins S, Kestens Y. Conceptualization and measurement of environmental exposure in epidemiology: Accounting for activity space related to daily mobility. Health Place. 2013;1(21):86–93.
  • 206.Ge L, Yang C, Zucker D, Li J, Spiegelman D, Wang M. Measurement Error Correction for Spatially Defined Environmental Exposures in Survival Analysis [Internet]. arXiv; 2024 [cited 2025 Apr 17]. Available from: http://arxiv.org/abs/2410.09278
  • 207.Liu Y, Kwan MP, Wang J. Analytically articulating the effect of buffer size on urban green space exposure measures. Int J Geogr Inf Sci. 2025;39(2):255–76.
  • 208.Lou X, Luo P, Meng L. GeoConformal prediction: a model-agnostic framework of measuring the uncertainty of spatial prediction [Internet]. arXiv; 2024 [cited 2025 Apr 17]. Available from: http://arxiv.org/abs/2412.08661

Associated Data

Data Availability Statement

No datasets were generated or analysed during the current study.


Articles from Current Environmental Health Reports are provided here courtesy of Springer
