Abstract
Environmental Protection Agency (EPA) air quality (AQ) monitors, the “gold standard” for measuring air pollutants, are sparsely positioned across the US. Low-cost sensors (LCS) are increasingly being used by the public to fill in the gaps in AQ monitoring; however, LCS are not as accurate as EPA monitors. In this work, we investigate factors impacting the differences between an individual’s true (unobserved) exposure to air pollution and the exposure reported by their nearest AQ instrument (which could be either an LCS or an EPA monitor). We use simulations based on California data to explore different combinations of hypothetical LCS placement strategies (e.g. at schools or near major roads), for different numbers of LCS, with varying plausible amounts of LCS device measurement error. We illustrate how real-time AQ reporting could be improved (or, in some cases, worsened) by using LCS, both for the population overall and for marginalized communities specifically. This work has implications for the integration of LCS into real-time AQ reporting platforms.
Keywords: air quality, low-cost sensors, environmental justice, information access, decision-making
Graphical Abstract

1. Introduction
Decades of research have documented the adverse health impacts of both short- and long-term exposure to air pollution. In this study, we focus on fine particulate matter (PM2.5), an air pollutant that has been associated with various adverse health outcomes1–3. In the US, although air quality (AQ) has been improving overall over the years, disparities persist between the PM2.5 concentrations experienced by different subpopulations4–6. In addition, certain parts of the US are experiencing, or are likely to experience, higher air pollution (including PM2.5 exposure) from climate-change-related events and processes such as exacerbated wildfires and dust storms7. To develop effective AQ management plans and address key concerns of equity, accurate and high-resolution AQ monitoring data are needed.
The Federal Reference Method or Federal Equivalent Method (FRM or FEM) monitors deployed by the US Environmental Protection Agency (EPA) are considered the “gold standard” for measuring AQ. However, due to the prohibitively high capital (USD $10,000+) and operating costs8 of these instruments, they have been deployed in fewer than a third of all US counties9. Even in counties that have an EPA monitor, the often substantial within-county variability in AQ can render measurements unrepresentative of the pollution levels experienced by many residents of the county10. As these monitors tend to be deployed in more populous locations11,12, residents in rural areas tend to be farther away from monitors.
In recent years, low-cost AQ sensors (< USD $2,500 as defined by the EPA Air Sensor Guidebook13, but often much cheaper) have been gaining attention as a supplement to FRM/FEM monitoring, and many are deployed by private citizens in addition to public and private entities14. Key motivations of individuals (ascertained from online product reviews) include managing health impacts of wildfire smoke and other air pollution as well as detecting air pollution sources of concern15. Networks of these sensors can help increase the spatial resolution and frequency of AQ measurements16. However, the measurements from low-cost sensors (henceforth, LCS) have lower accuracy than those from EPA monitors, and can be further affected by environmental conditions such as relative humidity, temperature, and aerosol composition14,17–19. Recent work has shown that algorithmic correction can reduce the error in LCS, but does not eliminate the problem20–23. These studies have also highlighted the challenge of developing corrections that are transferable across measurements collected in different locations and/or time periods. Despite these drawbacks, research has shown that LCS can be useful in identifying air pollution hotspots, expanding local awareness about AQ and health, and alerting the public to short-term changes in AQ, which may facilitate reduction of exposure to air pollution, e.g. by individuals choosing not to exercise outdoors on a high pollution day13,24–26.
To our knowledge, the question of how incorporating measurements from LCS into real-time reporting systems affects the accuracy of people’s AQ information remains unanswered. In addition to sensor measurement error, an important factor that must be considered is the spatial distribution (both spatial density and relative placement) of LCS. In this study, we assume that individuals view AQ data from the nearest instrument – EPA monitor or LCS – (e.g., as shown on a smartphone app) as their current AQ exposure. Despite the potential for LCS to increase and democratize access to AQ information, a 2021 study found that PurpleAir sensors (one of the most widely used LCS brands, costing around $250 per sensor) in the US tend to be disproportionately located in neighborhoods that have higher incomes and higher percentages of white residents, compared both to the locations of EPA monitors and to the US overall14. This suggests that, relative to more privileged groups, residents of marginalized neighborhoods (who, as previously noted, tend to experience worse air quality) may have less access to information about their local AQ, a precursor to adaptive actions which could be taken to protect their health.
In summary, while collection and dissemination of LCS data have been increasing, there is a need for evaluating the impact of integrating LCS data into AQ reporting platforms. In this study, we investigate how altering the number of LCS deployed, amount and type of sensor measurement error, and relative placements of LCS affects the accuracy of daily AQ information available to individuals from their nearest AQ instrument (defined to be either an LCS or an EPA monitor). Our main objectives are to (a) increase the nuance with which various groups (scientists, policy makers, community organizations, etc.) can think about and discuss tradeoffs when it comes to measuring and reporting AQ to the public and (b) suggest directions for future work to both the LCS instrument / data science and environmental health / policy communities. We focus on PM2.5 and base our LCS investigation on PurpleAir sensors, whose data are used in a number of popular regional real-time AQ maps27,28.
Our analysis consists first of simulating realistic LCS PM2.5 measurements under numerous hypothetical LCS deployment and sensor measurement error scenarios in the state of California. Then, we compare the local AQ information available to individuals in each simulation scenario with that produced by (i) EPA monitors only, as well as (ii) the existing PurpleAir sensor network. We dedicate special attention to evaluating how each scenario impacts disparities in AQ information accuracy for marginalized groups, such as those living in communities with high rates of poverty or with high proportions of nonwhite or Hispanic residents. Our findings can be used to inform decisions about (a) where to place LCS to make real-time AQ reporting more accurate and equitable, (b) how many LCS to deploy, (c) whether existing sensor calibration approaches yield sufficient accuracy to justify use of LCS for real-time AQ reporting, and (d) what amount of error is “tolerable” for future LCS deployments.
In section 2 (Materials & Methods), we describe each step of the analysis and the sources of data used. Section 3 (Results) includes a comparison of the impact of different types and amounts of sensor measurement error for LCS at current PurpleAir locations, as well as a comparison across different LCS placement strategies and numbers of LCS deployed. Section 4 (Discussion) includes conclusions, limitations, and some ideas for future investigation.
2. Materials & Methods
2.1. Study Setting and Overview
To evaluate the potential impact of LCS measurements on localized AQ information accuracy, our study leverages real data on EPA monitor locations, PurpleAir LCS locations, and sociodemographic and geospatial features in California. Our choice to situate the study in California was primarily motivated by California’s widespread LCS uptake (California contains over half of the US’s PurpleAir sensors), in part prompted by concerns about increasing air pollution from wildfires29–32.
To ensure that our simulations accounted for realistic spatial and temporal variability in PM2.5, we assumed the “true” (error-free and comprehensive) daily ambient PM2.5 concentrations were those obtained from an ensemble model predicting PM2.5 exposures daily at 1 km × 1 km in 2016, created by Di et al. (2016 was the most recent year for which these predictions were available)33. These estimates agreed well with ground-based reference measurements: the 10-fold cross-validated R2 was 0.86 for the US overall and 0.80 for the Pacific coast states (including California). We also considered the current locations of EPA monitors (n=154 in California) to be fixed, and in our simulations, we set the PM2.5 measurements from each of these monitors to be equal to the Di et al. PM2.5 estimates at these locations (i.e., we assumed that there was no error in the measurements from these monitors).
Details of our simulation procedure are provided in the following subsections. Here is a brief overview to serve as a roadmap:
1. Using the placement strategy specified by the simulation scenario, select hypothetical locations of LCS
2. For each 1 km × 1 km grid in California, identify the grid centroid’s nearest AQ instrument (among the real EPA monitor locations and hypothetical LCS locations)
3. For each day in 2016, simulate LCS PM2.5 measurements by adding simulated device measurement error to the “true” PM2.5 estimates from Di et al. at each hypothetical LCS’s location
4. Evaluate the accuracy and equity of AQ information observed (based on the measurements from the nearest AQ instrument) across all 475,772 grids in California and 366 days in 2016
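One replicate of the steps above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the grid, monitor, and LCS coordinates and the "true" PM2.5 field are all randomly generated toy data standing in for the California inputs, and the sensor error shown is a single fixed-SD Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the real data: 100 grid centroids, 3 EPA monitors,
# 5 LCS, and 10 days of "true" PM2.5 (coordinates in km, PM2.5 in ug/m3).
grid_xy = rng.uniform(0, 50, size=(100, 2))
monitor_xy = rng.uniform(0, 50, size=(3, 2))
lcs_xy = rng.uniform(0, 50, size=(5, 2))
true_pm = rng.gamma(shape=2.0, scale=5.0, size=(10, 100))  # days x grids

# Step 2: each grid observes its nearest instrument (EPA monitor or LCS).
instrument_xy = np.vstack([monitor_xy, lcs_xy])
dists = np.linalg.norm(grid_xy[:, None, :] - instrument_xy[None, :, :], axis=2)
nearest = dists.argmin(axis=1)   # index of the nearest instrument per grid
inst_grid = dists.argmin(axis=0)  # grid cell each instrument sits in

# Step 3: instruments report the "true" PM2.5 of their own grid cell; LCS
# additionally get simulated device error (EPA monitors are error-free).
n_mon = len(monitor_xy)
reported = true_pm[:, inst_grid]  # days x instruments (fancy indexing copies)
reported[:, n_mon:] += rng.normal(0.0, 1.25, size=(true_pm.shape[0],
                                                   len(lcs_xy)))

# Step 4: error between experienced and shown exposure, per grid-day.
shown = reported[:, nearest]
mae = np.abs(true_pm - shown).mean()
```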
2.2. Selecting LCS Locations
Table S1 (in the Supporting Information) describes the data sets, data processing steps, and sampling methods used to select locations for LCS in each simulation.
To guide hypothetical LCS deployment strategies focused on environmentally and socially marginalized communities, we leveraged the CalEnviroScreen (CES) index, developed by the California Office of Environmental Health Hazard Assessment, which describes both environmental and socioeconomic-demographic marginalization at the Census tract level34, as well as its environmental component, henceforth referred to as the Pollution Score. The Pollution Score incorporates data on air pollution (ozone, PM2.5, diesel PM emissions, toxic chemical releases from facilities) and traffic density; pesticides, groundwater threats, impaired water bodies, and drinking water contamination index; solid waste, hazardous waste, and cleanup sites. The Pearson correlation between the Pollution Scores and the PM2.5 estimates from Di et al. is 0.48. The socioeconomic-demographic disadvantage index used by CES incorporates data on asthma, low birth weight, cardiovascular disease, education, linguistic isolation, poverty, unemployment, and housing burden. The CES Score is a product of these environmental and socioeconomic-demographic indices.
In this study, we considered the following five hypothetical LCS placement strategies, illustrated in Figure S1: (a) at randomly selected real outdoor PurpleAir locations, (b) at randomly selected public schools, (c) at randomly selected locations favoring proximity to major roads, (d) at randomly selected locations favoring high CES Score, and (e) at randomly selected locations favoring high Pollution Score. For the last three placement strategies, “favoring” refers to weighted random sampling (respectively using nearby road lengths, CES Score, and Pollution Score as weights) to determine placement locations. We compared each of these placement strategies across different numbers of sensors deployed (0, 50, 100, 250, 500, and 1000 LCS) to show the trends. For context, the average numbers of LCS assigned to Los Angeles, Sacramento, and Imperial counties (containing a large city, a medium-small city, and a well-known environmental justice focus area, respectively) under each placement strategy are provided in Table S2.
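The "favoring" strategies reduce to weighted sampling without replacement. A minimal sketch, with toy CES Scores in place of the real tract-level values (the function name `place_lcs` is ours, not from the paper's code):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy CES Scores for 1,000 candidate grid cells (the real analysis maps
# Census-tract CES Scores onto the 1 km x 1 km grid).
ces_score = rng.uniform(0, 100, size=1000)

def place_lcs(weights, n_lcs, rng):
    """Weighted random sampling without replacement, as in the placement
    strategies 'favoring' road proximity, CES Score, or Pollution Score."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()  # normalize weights into sampling probabilities
    return rng.choice(len(p), size=n_lcs, replace=False, p=p)

chosen = place_lcs(ces_score, n_lcs=100, rng=rng)
```

On average, the chosen cells skew toward higher CES Scores than the candidate pool, which is exactly the intended bias of these strategies.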
2.3. Observing AQ Information from the Nearest AQ Instrument
We assumed that all individuals in each 1 km × 1 km grid in California observed daily AQ measurements from the instrument (either EPA monitor or LCS based on simulated placement strategy) nearest to their grid centroid, as shown in Figure 1. In the True Air Pollution Exposure column (left), the background color in each grid cell represents the “true” air pollution that individuals experience (obtained from the Di et al. estimates). Note that these true exposures are most often not observed. The colors inside the triangles and circles represent the measurements from EPA monitors and LCS, respectively. LCS measurement error is represented in the bottom row, where the color inside the circle differs from the background color of the grid it’s in. In the Shown Air Pollution Exposure column (right), the background color in each grid is the air pollution measurement that individuals observe from their nearest AQ instrument. The differences between the AQ that individuals experience and the AQ that they are shown are indicated by the red and blue X’s in Figure 1. The red X’s indicate cells where AQ is over-classified, i.e., the AQ shown to residents is worse than the AQ truly experienced. The blue X’s indicate cells where AQ is under-classified, i.e., the AQ shown to residents is better than the AQ truly experienced.
Figure 1. AQ Information Reporting Diagram.

An illustration of the assumed process of AQ information reporting (from each grid’s nearest AQ instrument), under two scenarios: without LCS measurement error (top row of panels) and with LCS measurement error (bottom row of panels). In the True Exposure column, the background color of each grid represents the (unobserved) air quality an individual in that grid experiences, whereas in the Shown Exposure column, the background color represents the air quality an individual in that grid observes from their nearest AQ instrument.
Figure 1 illustrates three distinct sources of AQ reporting error to which we will refer throughout the rest of this paper: (i) distance to the nearest AQ instrument, (ii) local variability in air quality, and (iii) sensor measurement error. As an example of distance-based errors, the distance between an individual in A5 and the nearest AQ instrument is large, so an individual in A5 is unlikely to be shown accurate measurements of their air pollution exposure (in Figure 1, they would be shown that their exposure is 15 μg/m3 instead of the true exposure, which is 5 μg/m3). As an example of local variability-based errors, while an individual in C5 is close to an EPA monitor (D5), local variability in AQ between C5 and D5 results in misclassification of C5’s AQ (they would be shown that their exposure is 15 μg/m3, while their true exposure is 5 μg/m3). Even if a cell contains an LCS, sensor measurement error may still result in reporting error, as in the case of an individual in C2: under the setting of device measurement error, they would be shown that their exposure is 30 μg/m3, instead of their true exposure, which is 50 μg/m3. These effects can also co-occur, as for an individual in D2: the nearest AQ instrument is in cell C2, which, in addition to having lower air pollution than D2, also suffers from LCS measurement error. In this study, we help disentangle these effects.
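The three sources can also be separated arithmetically: the total reporting error is the device measurement error plus a "representativeness" term that bundles distance and local AQ variability. A toy sketch under the assumption of Gaussian errors (all values simulated; variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)

true_at_grid = rng.gamma(2.0, 5.0, size=1000)  # exposure experienced at grid
# Truth at the nearest instrument differs due to distance + local variability.
true_at_inst = true_at_grid + rng.normal(0.0, 2.0, size=1000)
sensor_err = rng.normal(0.0, 1.25, size=1000)  # LCS device measurement error
shown = true_at_inst + sensor_err              # what residents are shown

# Decomposition: reporting error = representativeness error + sensor error.
total_err = shown - true_at_grid
represent_err = true_at_inst - true_at_grid
```

The identity `total_err = represent_err + sensor_err` holds exactly, which is what lets the simulations attribute reporting error to each component.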
2.4. Simulating Sensor Measurement Error
In each hypothetical (simulated) LCS deployment scenario, daily PM2.5 “measurements” from LCS were generated by adding sensor measurement error to the Di et al. PM2.5 estimates, in several different ways. First, we selected measurement error distributions informed by a tiered target for AQ instrument accuracy proposed at an EPA workshop35. The proposal is that AQ measurements (i) for regulatory purposes require accuracy of ±10% of the true average PM2.5 in that area, (ii) for mapping spatial gradients and monitoring microenvironments require accuracy of ±25%, and (iii) for hotspot detection require accuracy of ±50%. In our simulations, sensor measurement errors were generated both (a) differentially with respect to true PM2.5 and (b) non-differentially (with error magnitude not varying across PM2.5 levels). The former was motivated by empirical LCS observations, and accounts for the possibility of some spatial and temporal correlation in the sensor measurement errors due to some spatial and temporal smoothing induced by the Di et al. modeling approach. The latter assumes independence of the sensor measurement errors (post calibration of the LCS).
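The differential/non-differential distinction can be illustrated as follows. This is a simplified sketch, not the paper's exact generation procedure (which is in the Supplemental Notes): it assumes Gaussian errors, with a fixed standard deviation in the non-differential case and a standard deviation proportional to the true concentration in the differential case.

```python
import numpy as np

rng = np.random.default_rng(7)
true_pm = rng.gamma(2.0, 5.0, size=10000)  # toy "true" daily PM2.5 (ug/m3)

# Non-differential: error magnitude does not vary with the PM2.5 level
# (here, a fixed-SD Gaussian standing in for the +/-25% accuracy tier).
nondiff = true_pm + rng.normal(0.0, 1.25, size=true_pm.shape)

# Differential: error scales with the true concentration, so higher-pollution
# locations (often marginalized areas) see larger absolute errors.
diff = true_pm * (1.0 + rng.normal(0.0, 0.25, size=true_pm.shape))
```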
Second, we simulated LCS measurement errors in a manner enabling assessment of a nation-wide correction algorithm for PurpleAir sensors developed by EPA researchers36. Specifically, we sampled errors from the empirical distribution of residuals obtained by comparing measurements from EPA monitors in California to the corrected measurements from collocated PurpleAir sensors.
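Sampling from an empirical residual distribution can be sketched as below. For simplicity this draws residuals uniformly with replacement and ignores the decile stratification implied by the "Residual Decile Draws" label in Table 1; the residual values themselves are simulated here rather than taken from the collocation data.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-ins for the residuals: corrected collocated PurpleAir minus
# EPA monitor measurements (in practice, from the EPA correction evaluation).
residuals = rng.normal(0.5, 3.0, size=500)

# Simulated LCS measurement = true value + a residual drawn (with
# replacement) from the empirical residual distribution.
true_pm = rng.gamma(2.0, 5.0, size=2000)
simulated = true_pm + rng.choice(residuals, size=true_pm.shape, replace=True)
```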
These procedures are detailed in the Supplemental Notes in the SI. Comparison of the characteristics and effects of all these different types and amounts of sensor measurement error is facilitated by Table 1 in the Results.
Table 1. Comparing Impacts of Different Sensor Accuracies.
Results (weighted by population density) when there are no LCS vs. LCS at all real PurpleAir locations (n = 4,343), assuming different kinds and amounts of sensor measurement error (ME), averaged across 100 simulation replicates to account for randomness in the sensor measurement error generation. Unless otherwise specified, “errors” refer to the difference between the true exposure experienced at each grid centroid and the exposure reported from the nearest AQ instrument. AQI under-classification is when the true exposure class is greater than what someone is shown, and over-classification is when the true exposure class is less than what someone is shown. Rate of UH misclassification (UHM) is the fraction of days with unhealthy AQI (Orange+) that are misreported as healthy AQI (Green or Yellow).
| Type/Amount of Sensor Measurement Error | Std. Dev. of Sensor Measurement Error (μg/m3) | MAE (μg/m3) | 95th Percentile of Errors (μg/m3) | Under-classified AQI (%) | Over-classified AQI (%) | UHM (%) |
|---|---|---|---|---|---|---|
| Overall Population | ||||||
| No LCS (only EPA monitors) | — | 1.46 | 4.45 | 2.05 | 6.79 | 11.37 |
| No Sensor Error | 0 | 0.79 | 2.79 | 2.12 | 2.37 | 15.02 |
| 10% Non-differential | 0.5 | 0.94 | 2.89 | 2.41 | 2.80 | 15.51 |
| 25% Non-differential | 1.25 | 1.33 | 3.52 | 3.07 | 4.13 | 16.27 |
| 10% Differential | 0.88 | 1.10 | 3.29 | 3.19 | 3.61 | 20.11 |
| 25% Differential | 2.19 | 1.85 | 5.39 | 5.04 | 6.45 | 28.38 |
| EPA Correction Residual Decile Draws | 3.32 | 2.45 | 7.16 | 8.27 | 6.02 | 27.10 |
| Population living in CBGs with high % nonwhite | ||||||
| No LCS (only EPA monitors) | — | 1.34 | 4.14 | 2.15 | 6.53 | 10.35 |
| No Sensor Error | 0 | 0.75 | 2.59 | 2.22 | 2.48 | 13.98 |
| 10% Non-differential | 0.5 | 0.89 | 2.71 | 2.53 | 2.95 | 14.57 |
| 25% Non-differential | 1.25 | 1.29 | 3.39 | 3.31 | 4.38 | 15.51 |
| 10% Differential | 0.92 | 1.08 | 3.21 | 3.45 | 3.84 | 19.87 |
| 25% Differential | 2.30 | 1.90 | 5.53 | 5.66 | 6.88 | 28.52 |
| EPA Correction Residual Decile Draws | 3.46 | 2.51 | 7.40 | 9.43 | 6.33 | 27.28 |
| Population living in CBGs with high % poverty | ||||||
| No LCS (only EPA monitors) | — | 1.28 | 4.06 | 2.09 | 5.79 | 8.40 |
| No Sensor Error | 0 | 0.81 | 2.80 | 2.22 | 2.68 | 10.06 |
| 10% Non-differential | 0.5 | 0.94 | 2.90 | 2.53 | 3.10 | 10.36 |
| 25% Non-differential | 1.25 | 1.31 | 3.52 | 3.31 | 4.40 | 11.38 |
| 10% Differential | 0.97 | 1.13 | 3.41 | 3.46 | 3.91 | 15.99 |
| 25% Differential | 2.44 | 1.94 | 5.75 | 5.78 | 6.71 | 25.55 |
| EPA Correction Residual Decile Draws | 3.65 | 2.54 | 7.59 | 9.60 | 6.16 | 24.20 |
2.5. Evaluating the AQ Information
The final step of each simulation was to evaluate the error between the true AQ exposures (which are most often unobserved) and the exposures reported by the nearest AQ instrument, summarized across all the grids and days. We evaluated accuracy in AQ reporting using the mean absolute error (MAE) in reported PM2.5 concentrations and the misclassification rates of the U.S. AQ index, or AQI37.
The AQI classifies AQ into six levels, with different public health recommendations for each: Green = “Good”, Yellow = “Moderate”, Orange = “Unhealthy for Sensitive Groups”, Red = “Unhealthy”, Purple = “Very Unhealthy”, and Maroon = “Hazardous”. AQI is often reported as a combination of air pollutants; however, for this analysis we used the single-pollutant version for PM2.538. We hypothesize that most people use these classifications, rather than the exact concentrations of PM2.5 (or any other air pollutant), to inform their activity, so we calculated the percent of over- and under-classifications, the percent of misclassifications greater than one level (e.g. Orange → Green or Yellow → Red), and what we term Unhealthy-Healthy Misclassifications (UHM): the fraction of days that a healthy (H) classification is shown, out of the days that are truly unhealthy (U). This last metric may be of the most concern for public health. For this dichotomous variable, we defined Green and Yellow to be healthy, and Orange through Maroon to be unhealthy.
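The classification and UHM metric can be sketched as follows, using the 2016-era EPA breakpoints for 24-hour PM2.5 (the Good/Moderate cut was subsequently lowered in the 2024 AQI revision; function names are ours):

```python
import numpy as np

# 2016-era EPA breakpoints for 24-hour PM2.5 (ug/m3).
# Level index: 0=Green, 1=Yellow, 2=Orange, 3=Red, 4=Purple, 5=Maroon.
BREAKS = np.array([12.0, 35.4, 55.4, 150.4, 250.4])

def aqi_level(pm25):
    """Map PM2.5 concentration(s) to AQI level index (upper bound inclusive)."""
    return np.searchsorted(BREAKS, pm25, side="left")

def uhm_rate(true_pm, shown_pm):
    """Fraction of truly unhealthy days (Orange+) shown as healthy
    (Green or Yellow): the Unhealthy-Healthy Misclassification rate."""
    true_lvl = aqi_level(np.asarray(true_pm))
    shown_lvl = aqi_level(np.asarray(shown_pm))
    unhealthy = true_lvl >= 2
    return np.mean(shown_lvl[unhealthy] <= 1)
```

For example, if two days are truly Red (e.g. 60 μg/m3) and one of them is shown as Green, the UHM rate is 0.5.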
When calculating these metrics, weighting each grid by its population density allows us to evaluate the accuracy of AQ information available to individuals in California. We also performed the calculations unweighted by population density, which represents averaging across the land area instead of averaging across individuals. However, as the population-weighted metrics are more relevant for public policy (e.g., for health impact assessments), we focus on these results in the main text; the unweighted results are provided in the SI.
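The contrast between averaging across individuals and averaging across land area is a one-line difference in the weighting, sketched here with toy per-grid errors and densities:

```python
import numpy as np

rng = np.random.default_rng(5)
grid_err = rng.gamma(1.0, 1.5, size=1000)       # toy per-grid absolute errors
pop_density = rng.lognormal(0.0, 1.0, size=1000)  # toy population densities

# Population-weighted MAE (averaging across individuals) vs. the
# unweighted version (averaging across land area).
weighted_mae = np.average(grid_err, weights=pop_density)
unweighted_mae = grid_err.mean()
```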
For each combination of sensor placement strategy, number of sensors, and type/amount of sensor measurement error, we ran 100 replicates and averaged the metrics across them to account for random sampling variability. The results are robust to the number of replicates (similar with 50 vs. 100).
2.6. Equity Analysis
In addition to the overall population metrics (averaging across all grids in California), we calculated the metrics for marginalized subsets of the population, to determine if certain sensor placement strategies resulted in more equitable or less equitable access to AQ information.
We obtained socio-demographic features from the 2016 American Community Survey39 using the R package tidycensus40, at the finest spatial resolution for which they were available: Census block group (CBG) level for race/ethnicity and population density, and Census tract level for socioeconomic status. To merge these features with the AQ data, we performed an overlay of the CBG shapefile with the 1 km × 1 km grid centroids from the Di et al. estimates. Any block group or tract that did not contain a grid centroid was ignored. Although this procedure tends to exclude more CBGs with smaller land area and higher population density, the population-density weighting in our main analysis counteracts possible bias.
For this analysis, we used the percentage of non-white individuals (all but non-Hispanic white) to represent marginalization by race/ethnicity, and the percentage of people living under the poverty line to represent marginalization by socioeconomic status. Our decision to use one minus the percent of non-Hispanic white people to represent disadvantage by race/ethnicity (elsewhere referred to as “% nonwhite” for verbal simplicity) was informed by a preliminary calculation showing that Hispanic white people on average experience higher pollution and socioeconomic disadvantage than the overall white population. We defined CBGs with high % nonwhite and high % poverty to be CBGs that fell into the top quintiles of these measures across the 1 km × 1 km grid centroids (≥58.1% and ≥23.5%, respectively). For the 0.3% of CBGs that were missing data on % nonwhite, we substituted in Census tract-level % nonwhite. Data for % poverty were only available at the tract level. Only one tract with population > 0 (tract 6037920200, population 5,000) lacked retrievable tract-level data, so we omitted that tract from the analysis.
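The top-quintile definition can be sketched as below, with toy % nonwhite values at the grid centroids (the real cutoff of ≥58.1% comes from the actual data, not from this simulation):

```python
import numpy as np

rng = np.random.default_rng(9)
# Toy % nonwhite values at the 1 km x 1 km grid centroids.
pct_nonwhite = rng.uniform(0, 100, size=10000)

# "High % nonwhite" = top quintile of the measure across grid centroids.
cutoff = np.quantile(pct_nonwhite, 0.8)
high_mask = pct_nonwhite >= cutoff  # grids in the marginalized subgroup
```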
In summary, for the equity analysis we calculated the AQ reporting error in CBGs with high % poverty and high % nonwhite residents. Maps of % poverty and % nonwhite residents in California are provided in Figure S2.
2.7. Software Availability
Code used to download and process the datasets, run the analyses, and generate the figures and tables can be found at https://github.com/EllenConsidine/LCS_placement_sims
3. Results
3.1. Descriptive Statistics
Summary statistics of average annual PM2.5 (according to the 1 km × 1 km estimates by Di et al.), % poverty, CES Scores, % nonwhite, and population density, for population subgroups as well as locations (centroids of 1 km × 1 km grid cells) targeted in each of the LCS placement strategies, are shown in Table S4. One key observation, consistent with the environmental justice literature4–6, is that CBGs with high % poverty or high % nonwhite have higher annual average PM2.5 than the population overall. As shown in Figure S3, these marginalized subgroups also experience far more days classified as unhealthy by the AQI (level orange and higher) than the overall population. One differentiator between the two marginalized subgroups (high % nonwhite and high % poverty) is that CBGs with high % nonwhite tend to have higher population density than CBGs with high % poverty.
Another observation from Table S4, consistent with external findings14, is that among real EPA monitor locations, PurpleAir locations, and the hypothetical LCS placement strategies considered, PurpleAir locations have by far the lowest % poverty. By contrast, EPA monitor locations have higher % poverty than any of the LCS placement strategies considered. Among the LCS placement strategies considered, schools and locations favored by CES Score have the highest % poverty. LCS placements at schools also have the highest % nonwhite out of any of the EPA monitor, PurpleAir, or hypothetical LCS locations.
Finally, while schools and PurpleAir locations tend to be in CBGs with higher population density, locations chosen to favor proximity to roads, CES Score, and Pollution Score tend to have lower population density.
3.2. Comparing Different Types and Amounts of Sensor Measurement Error, Assuming LCS are Placed at Current PurpleAir Locations
The average distance to the nearest EPA monitor is 10.11 km for the population overall, 8.94 km for CBGs with high % nonwhite residents, and 8.69 km for CBGs with high % poverty. When we include LCS at all current locations of outdoor PurpleAir sensors, the average distance to the nearest AQ instrument drops dramatically, to 2.41 km, 2.49 km, and 2.82 km, respectively. Note that these results, like all those in the main text, are weighted by population density.
Table 1 summarizes how the accuracy of daily AQ information changes when we compare the scenario where people only have access to EPA monitors, with the scenario where people have access to EPA monitors and LCS at current locations of outdoor PurpleAir sensors. Under the scenario with LCS at PurpleAir locations, we compare the different sensor measurement error types, as described in the Methods. The first column of Table 1 shows the amount of sensor measurement error under each measurement error type (calculated as the standard deviation of the mean-zero simulated errors). For all the differential sensor measurement error scenarios (10%, 25%, and empirical residual-based), marginalized subgroups on average experience higher sensor measurement error because their air pollution exposure is higher.
Next, we use several metrics to describe the accuracy of observed AQ information under each scenario: (a) absolute error (deviation from the true exposure value), captured using the mean absolute error (MAE) and 95th percentile (to illustrate the upper end of the error distribution in addition to the mean), (b) rate of misclassification (either over- or under-classification) of the AQI, and (c) rate of Unhealthy-Healthy misclassifications (UHM), which we define as the fraction of days with unhealthy AQI that are misreported as healthy AQI.
Table 1 shows that when the LCS have no sensor measurement error (i.e. they are as accurate as EPA monitors in our simulations), deploying them at all the real locations of PurpleAir sensors roughly halves the MAE and 95th percentile of error in daily reported air quality. These improvements are smaller for CBGs with high % poverty, likely because PurpleAir sensors tend to be situated in more socioeconomically privileged areas.
Counterintuitively, even in the absence of sensor measurement error, placement of LCS at current PurpleAir locations leads to increases in the rates of under-classification of the AQI and UHMs. These reductions in classification accuracy are likely due to local variability in AQ and the fact that PurpleAir locations have lower annual average PM2.5 than the state overall (Table S4). This issue is exacerbated by sensor measurement error.
With non-differential sensor measurement error, LCS (at all the current PurpleAir locations) with error magnitudes of ±10% and ±25% both generally improve on the no-LCS case except for CBGs with high % poverty, where the MAE increases slightly. The impact of sensor measurement error is likely exacerbated for these CBGs with high % poverty due to the socioeconomic bias of PurpleAir locations. By contrast, while 10% differential sensor measurement error improves on the no-LCS case in terms of absolute error, 25% differential error and empirical residual-based error worsen the real-time AQ reporting for all groups and by all metrics, except for some small reductions in over-classification of the AQI.
These results highlight the potential for (i) LCS to reduce both the distance to the nearest AQ instrument and the absolute error in daily PM2.5 reporting, (ii) the accuracy of classification-based AQ information to diverge from the accuracy of concentration-based AQ information, (iii) different AQ information outcomes for different subsets of the population, which is related to LCS placement characteristics, and (iv) the dependence of these insights on the type and amount of sensor measurement error.
3.3. Additionally Comparing Placement Strategies and Numbers of Sensors Deployed
We now discuss how different hypothetical LCS placement strategies and numbers of LCS deployed affect access to real-time AQ information under both non-differential and differential sensor measurement error scenarios.
Figure 2 shows the average distance to the nearest AQ instrument as well as the MAE resulting from simulations under each hypothetical LCS deployment scenario (i.e., each combination of placement strategy and number of LCS deployed) and for each sensor measurement error type. The vertical scales of the plots differ, to facilitate close inspection of the lines. The plot for 10% non-differential measurement error is not shown because it is very similar to panel b (the results with no sensor measurement error).
Figure 2. Distance and MAE.

Distance to the nearest AQ instrument and mean absolute error (between what is reported vs. experienced) resulting from different numbers of LCS deployed, LCS placement strategies, and sensor measurement error types and amounts. All results were calculated using 366 days and averaged across 100 simulation replicates, weighted by population density. Panel a shows the distance to the nearest AQ instrument (monitor or LCS), panel b shows MAE when there is no LCS device measurement error, panel c shows MAE when LCS device measurement error is 25% non-differential, panel d shows MAE when LCS device measurement error is 10% differential, panel e shows MAE when LCS device measurement error is 25% differential, and panel f shows MAE when LCS device measurement error is sampled from the empirical distribution of residuals from PurpleAir LCS measurements (calibrated with the EPA equation) compared with collocated EPA monitor measurements.
The six panels in Figure 2 help distinguish the three components of error in real-time AQ reporting. First, with zero or low amounts of LCS measurement error, much of the average error in daily AQ reporting is due to individuals’ distance to the nearest AQ instrument, as illustrated by the similar line ordering and slopes in panels a, b, and d. However, the benefit of reduced distance to the nearest AQ instrument can be offset by local variability in AQ. For example, while LCS placements favoring CBGs with high Pollution Score reduce the distance to the nearest AQ instrument more than placements favoring CBGs with high CES Score (panel a), the CES Score placement strategy yields lower MAE (panel b; the solid yellow line hides the solid orange line). Local variability in AQ explains this: the Pollution Score primarily highlights areas with large local sources of pollution, so measurements in those pollution “hotspots” may not be representative of air pollution even in nearby communities.
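The distance component in panel a can be reproduced with a simple nearest-instrument scan. The sketch below assumes great-circle (haversine) distance and a brute-force search; the function names and coordinates are illustrative only.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance (km) between two points in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def nearest_instrument_km(residence, instruments):
    """Distance from one residence to its nearest AQ instrument
    (EPA monitor or LCS), by brute-force scan."""
    lat, lon = residence
    return min(haversine_km(lat, lon, ilat, ilon) for ilat, ilon in instruments)

# Toy example: a residence near downtown Sacramento vs. two hypothetical
# instrument locations (one nearby, one across the state).
d = nearest_instrument_km((38.58, -121.49), [(38.60, -121.45), (37.77, -122.42)])
```

For statewide populations, a spatial index (e.g., a k-d tree on projected coordinates) would replace the brute-force scan, but the metric is the same.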
For low amounts of sensor measurement error (i.e. 10% non-differential and differential, as shown in panel d), reductions in daily AQ reporting error due to decreased distance to the nearest AQ instrument mitigate the impact of the LCS measurement error, improving MAEs across the board. When sensor measurement error is increased to 25% non-differential (panel c), deploying LCS only improves MAE under certain placement strategies: at schools and in current PurpleAir locations. This is largely because these placement strategies prioritize areas with higher population density.
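For intuition, the two error types can be mimicked in simulation. The formalization below is an assumption for illustration, not the paper's exact specification: non-differential error is modeled as zero-mean noise with no systematic relationship to the true value, while differential error is modeled as varying systematically with the true concentration.

```python
import numpy as np

rng = np.random.default_rng(42)

def nondifferential(true_pm, frac):
    # Zero-mean multiplicative noise: spread scales with the reading but
    # has no systematic dependence on the true value (assumed form).
    return true_pm * (1.0 + rng.normal(0.0, frac, size=true_pm.shape))

def differential(true_pm, frac):
    # Error that depends on the true concentration (assumed form): an
    # illustrative upward bias at higher concentrations, plus noise.
    bias = frac * true_pm * (true_pm > np.median(true_pm))
    return true_pm + bias + rng.normal(0.0, frac * true_pm)

true_pm = rng.uniform(2.0, 60.0, size=1000)  # hypothetical daily PM2.5 (µg/m³)
nd = nondifferential(true_pm, 0.10)          # "10% non-differential"
dd = differential(true_pm, 0.25)             # "25% differential"
```

The key qualitative point this reproduces is that differential error does not average out with respect to the true concentration, so it degrades reporting more than equally sized non-differential noise.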
With 25% differential and empirical residual-based sensor measurement errors (panels e and f), the impact of reduced distance to the nearest AQ instrument is overshadowed by the increased error in the sensor measurements, worsening AQ reporting across the board. Under these large amounts of sensor measurement error, placements favoring high CES Score result in the least error in AQ reporting. In nearly all cases, marginalized subgroups experience lower MAE than the population overall.
Figure 3 shows the rate of UHMs for all placement strategies, numbers of LCS deployed, and sensor measurement error types and amounts. One of the most noticeable patterns in Figure 3 is that the rate of UHMs increases as soon as any LCS are introduced, under all placement strategies and sensor measurement error types. This is especially true for LCS placements based on CES Score and Pollution Score (for the population overall and CBGs with high % nonwhite): the UHM rates remain roughly constant as the number of LCS changes. We posit that this is due to high local variability in AQ in Census tracts with high Pollution or CES Scores.
Figure 3. Unhealthy-Healthy Misclassifications.

Rates of UHMs resulting from different numbers of LCS deployed, LCS placement strategies, and sensor measurement error types and amounts. A UH misclassification occurs when the AQ is unhealthy (Orange+) but is reported as healthy (Green or Yellow); the UHM rate is the fraction of days with a UH misclassification divided by the fraction of unhealthy days experienced by each group. All results were calculated using 366 days and averaged across 100 simulation replicates, weighted by population density. Panel a shows the UHM rate with no LCS device measurement error; panel b, with 10% non-differential measurement error; panel c, with 25% non-differential measurement error; panel d, with 10% differential measurement error; panel e, with 25% differential measurement error; and panel f, with measurement error sampled from the empirical distribution of residuals from PurpleAir LCS measurements (calibrated with the EPA equation) compared with collocated EPA monitor measurements.
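The UHM rate defined in the caption can be sketched as follows. The PM2.5 cut points below are the widely published pre-2024 24-hour AQI breakpoints and are an assumption for illustration (boundary handling at the exact edges is approximate), not a quotation of the study's code.

```python
import numpy as np

# Pre-2024 24-h PM2.5 AQI breakpoints (µg/m³), 2012-standard cut points
# (assumed here for illustration).
AQI_EDGES = [12.0, 35.4, 55.4, 150.4, 250.4]

def aqi_category(pm):
    """0 = Green, 1 = Yellow, 2 = Orange, 3 = Red, 4 = Purple, 5 = Maroon."""
    return np.digitize(np.asarray(pm, dtype=float), AQI_EDGES)

def uhm_rate(experienced_pm, reported_pm):
    """Fraction of truly unhealthy (Orange+) days that are reported as
    healthy (Green or Yellow) -- the UHM rate described in the caption."""
    truly_unhealthy = aqi_category(experienced_pm) >= 2
    reported_healthy = aqi_category(reported_pm) <= 1
    if truly_unhealthy.sum() == 0:
        return 0.0
    return float((truly_unhealthy & reported_healthy).sum() / truly_unhealthy.sum())

# Toy example: 3 of 4 days are truly Orange+; 2 of those are reported healthy.
rate = uhm_rate([40.0, 60.0, 10.0, 80.0], [30.0, 20.0, 11.0, 90.0])
```

Because the denominator is each group's own count of unhealthy days, the UHM rate is comparable across subgroups that experience very different numbers of unhealthy days.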
Another important observation is that although marginalized subgroups experience more unhealthy days in absolute terms, the fraction of those unhealthy days misclassified as healthy is lower than for the population overall. Crucially, the CES and Pollution Score-based placements lead to the lowest UHM rates for CBGs with high % poverty. However, school locations are the only placement strategy that yields decreasing UHM rates at large numbers of LCS deployed (when sensor measurement error is nondifferential; panels a through c). A similar reversal is observed in panel c of Figure 2 for marginalized subgroups. To investigate whether the school placement strategy might produce the lowest rate of UHMs when many LCS are deployed, we ran simulations with LCS at all the schools (n=7,548); the results are summarized in Table S6. Notably, with 10% nondifferential sensor measurement error, this produces lower UHM rates than the EPA monitors alone. With 25% nondifferential sensor measurement error, the UHM rates are only slightly higher (e.g., 11.8% vs. 11.4% for the population overall).
One last note on Figure 3 is that although generally increasing sensor measurement error increases the rate of UHMs, the empirical residual-based errors (panel f) result in slightly lower UHM rates than the 25% differential scenario (in panel e) for LCS at schools and PurpleAir locations.
Further insight can be gained from Figure S5, which illustrates the contributions of distance to nearest instrument, local variability in AQ, and sensor measurement error to large misclassifications of the AQI (off by more than one class) for each of the LCS placement strategies.
4. Discussion
In this study, we investigated the utility of including measurements from LCS in real-time AQ reporting using simulations based closely on real data. By comparing different types and amounts of sensor measurement error as well as different LCS placement strategies and numbers of LCS deployed, we were able to differentiate between the impacts of three components of error in daily AQ reporting: distance to the nearest AQ instrument, local variability in AQ, and sensor measurement error.
Our findings offer several key insights and suggestions. One of the most important is that the value of using LCS for real-time AQ reporting depends strongly on the amount and type of sensor measurement error, and also on the metric to which people pay attention (i.e., absolute concentration vs. AQI classification). Considering MAE, deploying LCS assuming 10% measurement error (either differential or non-differential) improves daily AQ reporting (compared to only using EPA monitors) across the board. However, among the placement strategies considered here, only LCS placements at schools and PurpleAir locations improve AQ reporting when sensor measurement error is assumed to be 25% and non-differential. Deploying LCS assuming 25% differential measurement error worsens daily AQ reporting across the board.
By contrast, introducing any number of LCS with any amount of sensor measurement error tends to increase the rate of UH misclassifications, which may be even more relevant for public health than absolute PM2.5 concentrations (whereas metrics like MAE may be more relevant for AQ-health science). This indicates that the existing EPA monitor network in California is relatively good at reporting whether the AQ is healthy or unhealthy. However, the UHM rate begins to decrease once more than 500 LCS are deployed at schools when sensor measurement error is assumed to be 10% or 25% and nondifferential. If sensor measurement error is 10% nondifferential, then deploying LCS at all schools in California (n=7,548) produces a lower rate of UHMs than EPA monitors alone (about the same if sensor measurement error is 25% nondifferential). Of course, organizations deploying LCS will need to balance these considerations with their budgets for purchasing and maintaining LCS.
Accounting for both absolute concentration and AQI classification metrics, placing LCS at schools appears to yield the most accurate and equitable distribution of daily AQ information when sensor measurement error is less than 25% and nondifferential and more than 500 LCS are deployed. The latter is quite realistic given that in California in 2021, there were 4,343 1 km × 1 km grid cells containing PurpleAir sensors, not to mention other brands of LCS. From a health standpoint, children’s relatively high vulnerability to air pollution41 further motivates the strategy of placing LCS at schools.
For our empirically-based simulation, the degree of error injected into our simulated LCS measurements was drawn from the empirical distribution of errors between collocated California EPA monitor and PurpleAir sensor data after correction using a national equation developed by EPA researchers36. We believe this degree of measurement error most closely reflects the error in LCS measurements publicly available through many programs and platforms today. Our simulations show that, under these conditions, using data from LCS in real-time AQ reporting would worsen the accuracy of information from people’s nearest AQ instrument, both for the overall population and for the marginalized groups considered. This result suggests that more region-specific LCS calibration procedures may be necessary for this application of LCS, which aligns with several recent studies advocating for region-specific corrections21,42.
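The empirical error-injection procedure can be sketched as below, using the linear form of the U.S.-wide PurpleAir correction from ref. 36 (PM2.5 = 0.524·PA − 0.0862·RH + 5.75). The collocation arrays here are illustrative stand-ins, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)

def epa_correction(pa_pm, rh):
    """Linear form of the U.S.-wide PurpleAir correction from ref. 36:
    PM2.5 = 0.524*PA - 0.0862*RH + 5.75 (PA in µg/m³, RH in %)."""
    return 0.524 * np.asarray(pa_pm) - 0.0862 * np.asarray(rh) + 5.75

# Step 1: residuals from collocated pairs (corrected LCS minus EPA monitor).
# These three pairs are illustrative stand-ins for the collocation dataset.
pa_raw = np.array([20.0, 35.0, 50.0])
rh = np.array([40.0, 55.0, 60.0])
epa_monitor = np.array([15.0, 20.0, 28.0])
residuals = epa_correction(pa_raw, rh) - epa_monitor

# Step 2: inject residuals sampled (with replacement) from the empirical
# distribution into simulated "true" PM2.5 to mimic LCS measurements.
true_pm = np.array([10.0, 12.0, 30.0, 45.0])
simulated_lcs = true_pm + rng.choice(residuals, size=true_pm.shape)
```

Resampling observed residuals rather than parametric noise preserves any skew or heavy tails in real LCS error, which is what distinguishes panel f from the percentage-error scenarios.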
When sensor measurement error is assumed to be high, LCS placement strategies that prioritize those burdened by environmental pollution and sociodemographic injustice result in the most equitable provision of AQ information. Also, across all levels of sensor measurement error and numbers of LCS deployed, selecting LCS placements based on CES Score results in the lowest rate of UHMs for CBGs with high % poverty. Lastly, our simulations revealed that AQ information accuracy under the CES and Pollution Score-based placements is often affected by high local variability in air quality, which makes sense because these locations tend to be near major sources of air pollution. This indicates that while placing LCS in environmental justice hotspots may benefit those in the immediate community, integrating their data into wider AQ reporting platforms may lead to worsening of real-time AQ information for those outside the immediate community.
Balancing policy priorities related to LCS deployment will not be easy. Our analysis unweighted by population density (results shown in the SI) suggests that strategically deploying some LCS near major roads, especially in less densely populated areas, could help balance the needs of people in less densely populated communities with those of people in more densely populated communities.
These results can inform future investment in LCS networks for equitable AQ monitoring programs in the US, and the methods used can inform similar studies in other locales. While our findings are based on PurpleAir sensors for PM2.5, this work informs accuracy targets and larger concerns about LCS placement and calibration across brands of PM2.5 sensors, and possibly for other air pollutants. However, it is important to note that this analysis has focused on the use of LCS for provision of real-time AQ information to the public. These insights do not necessarily transfer to other applications of LCS. For example, when used for research purposes, LCS have been shown to help capture neighborhood-scale PM2.5 when fused with satellite data43 or incorporated into either a kriging model44–46 or a machine learning model with spatially-varying correction for the LCS24. That said, the simulation methodology developed for this study could be adapted for many other research questions.
4.1. Limitations and Future Directions
Design-related limitations of our study are that we only used data from California and that we assumed individuals view the AQ information from the instrument nearest to their residence as their personal exposure. A technical limitation is our use of the Di et al. daily 1 km × 1 km estimates as the “true” exposures. AQ may vary substantially over 24 hours and within a 1 km × 1 km square, and as several studies have observed, the PM2.5 patterns sensed by LCS often differ (i.e., are affected more by local sources) from those sensed by the EPA monitors47,48 that Di et al. used to train their model. We also did not consider differences in LCS performance due to varying PM composition, which have been observed elsewhere35.
Future work might consider more nuance in the LCS measurement error problem, such as accounting for sensor “drift”20,49,50 and varying particle composition / meteorological conditions, as has been explored by EPA researchers who have proposed a different correction method for wildfire smoke measured by PurpleAir sensors51. This might be addressed by considering more spatial and temporal correlation in the LCS measurement errors. In terms of LCS placement, while we chose relatively simple selection strategies to facilitate comprehension and comparison, future research could harness more sophisticated statistical and/or atmospheric modeling techniques45,52,53 to identify locations yielding key spatiotemporal information or to prioritize some LCS placements near EPA monitors for the purposes of sensor calibration. Finally, there is more work to be done investigating how access to real-time AQ information and/or alerts translates into public health and economic benefits, as several studies have begun exploring54–58.
Supplementary Material
Synopsis:
How does real-time AQ information change as we deploy AQ sensors with different accuracies in different numbers and places?
Acknowledgments
We thank the National Studies on Air Pollution and Health (NSAPH) research lab, specifically the biostatistics working group led by Francesca Dominici, for their support of this project.
The computations in this paper were run on the FASRC Cannon cluster supported by the FAS Division of Science Research Computing Group at Harvard University.
Funding:
National Institutes of Health grant 5T32ES007142 (EMC)
National Institutes of Health grant 1K01ES032458 (RCN)
Footnotes
Competing interests: Authors declare that they have no competing interests.
- Supplemental notes on LCS device measurement error simulation.
- Supplemental figures and tables for all analyses: maps visualizing LCS placement strategies and distributions of marginalized groups, descriptions of contextual datasets and processing steps, annual average PM2.5 summaries and other descriptive statistics, hypothetical numbers of LCS in well-known counties under each placement strategy, distributions of simulated and empirical sensor measurement error, and summary statistics of the Di et al. estimates as used for the empirical sensor measurement error simulation.
- Supplemental figures and tables for the analysis weighted by population density: basic descriptive statistics, distance to the nearest AQ instrument among observations misclassified by more than one level of the AQI, underclassifications and overclassifications of the AQI, metrics with LCS deployed at all schools (n = 7,548).
- Supplemental figures and tables for the analysis unweighted by population density: counterparts of all figures and tables (from the weighted analysis) in the main text and SI.
Data and materials availability:
Code used to download and process the datasets, run the analyses, and generate the figures and tables can be found at https://github.com/EllenConsidine/LCS_placement_sims. All data used in this study are publicly available and our analytic (processed) dataset is on Harvard Dataverse, at https://doi.org/10.7910/DVN/QR4N7V. Descriptions of the data sources are in the Materials & Methods section as well as our GitHub README file.
References
- (1). Di Q; Dai L; Wang Y; Zanobetti A; Choirat C; Schwartz JD; Dominici F. Association of Short-Term Exposure to Air Pollution with Mortality in Older Adults. JAMA 2017, 318 (24), 2446–2456. 10.1001/jama.2017.17923.
- (2). Di Q; Wang Y; Zanobetti A; Wang Y; Koutrakis P; Choirat C; Dominici F; Schwartz JD. Air Pollution and Mortality in the Medicare Population. N. Engl. J. Med. 2017, 376 (26), 2513–2522. 10.1056/NEJMoa1702747.
- (3). US EPA. How Does PM Affect Human Health? Air Quality Planning Unit, Ground-level Ozone, New England, US EPA. https://www3.epa.gov/region1/airquality/pm-human-health.html (accessed 2022-05-06).
- (4). Jbaily A; Zhou X; Liu J; Lee T-H; Kamareddine L; Verguet S; Dominici F. Air Pollution Exposure Disparities across US Population and Income Groups. Nature 2022, 601 (7892), 228–233. 10.1038/s41586-021-04190-y.
- (5). Colmer J; Hardman I; Shimshack J; Voorheis J. Disparities in PM2.5 Air Pollution in the United States. Science 2020, 369 (6503), 575–578. 10.1126/science.aaz9353.
- (6). Tessum CW; Apte JS; Goodkind AL; Muller NZ; Mullins KA; Paolella DA; Polasky S; Springer NP; Thakrar SK; Marshall JD; Hill JD. Inequity in Consumption of Goods and Services Adds to Racial–Ethnic Disparities in Air Pollution Exposure. Proc. Natl. Acad. Sci. 2019, 116 (13), 6001–6006. 10.1073/pnas.1818859116.
- (7). Orru H; Ebi KL; Forsberg B. The Interplay of Climate Change and Air Pollution on Health. Curr. Environ. Health Rep. 2017, 4 (4), 504–513. 10.1007/s40572-017-0168-6.
- (8). Miller D. A Low-Cost Air Quality Sensor for Measuring Particulate Matter. Intermt. Park Sci. 2021, 11. National Park Service. https://www.nps.gov/articles/000/a-low-cost-air-quality-sensor-for-measuring-particulate-matter.htm (accessed 2022-01-03).
- (9). Keller JP; Peng RD. Error in Estimating Area-level Air Pollution Exposures for Epidemiology. Environmetrics 2019, 30 (8). 10.1002/env.2573.
- (10). English PB; Olmedo L; Bejarano E; Lugo H; Murillo E; Seto E; Wong M; King G; Wilkie A; Meltzer D; Carvlin G; Jerrett M; Northcross A. The Imperial County Community Air Monitoring Network: A Model for Community-Based Environmental Monitoring for Public Health Action. Environ. Health Perspect. 2017, 125 (7), 074501. 10.1289/EHP1772.
- (11). Grainger C; Schreiber A. Discrimination in Ambient Air Pollution Monitoring? AEA Pap. Proc. 2019, 109, 277–282.
- (12). Watson JG; Chow JC; DuBois D; Green M; Frank N. Guidance for the Network Design and Optimum Site Exposure for PM2.5 and PM10; PB-99–157513/XAB; EPA-454/R-99/022; Environmental Protection Agency, Office of Air Quality Planning and Standards, Research Triangle Park, NC (United States); Nevada Univ. System, Desert Research Inst., Reno, NV (United States); National Oceanic and Atmospheric Administration, Las Vegas, NV (United States), 1997. https://www.osti.gov/biblio/678946-guidance-network-design-optimum-site-exposure-pm2-pm10 (accessed 2022-01-03).
- (13). US EPA. How to Use Air Sensors: Air Sensor Guidebook. https://www.epa.gov/air-sensor-toolbox/how-use-air-sensors-air-sensor-guidebook (accessed 2021-11-14).
- (14). deSouza P; Kinney PL. On the Distribution of Low-Cost PM2.5 Sensors in the US: Demographic and Air Quality Associations. J. Expo. Sci. Environ. Epidemiol. 2021, 31 (3), 514–524. 10.1038/s41370-021-00328-2.
- (15). deSouza PN. Key Concerns and Drivers of Low-Cost Air Quality Sensor Use. Sustainability 2022, 14 (1), 584. 10.3390/su14010584.
- (16). Lu Y; Giuliano G; Habre R. Estimating Hourly PM2.5 Concentrations at the Neighborhood Scale Using a Low-Cost Air Sensor Network: A Los Angeles Case Study. Environ. Res. 2021, 195, 110653. 10.1016/j.envres.2020.110653.
- (17). Jayaratne R; Liu X; Ahn K-H; Asumadu-Sakyi A; Fisher G; Gao J; Mabon A; Mazaheri M; Mullins B; Nyakcrillu M; Ristovski Z; Scorgie Y; Thai P; Dunbabin M; Morawska L. Low-Cost PM2.5 Sensors: An Assessment of Their Suitability for Various Applications. Aerosol Air Qual. Res. 2020. 10.4209/aaqr.2018.10.0390.
- (18). Crilley LR; Singh A; Kramer LJ; Shaw MD; Alam MS; Apte JS; Bloss WJ; Hildebrandt Ruiz L; Fu P; Fu W; Gani S; Gatari M; Ilyinskaya E; Lewis AC; Ng’ang’a D; Sun Y; Whitty RCW; Yue S; Young S; Pope FD. Effect of Aerosol Composition on the Performance of Low-Cost Optical Particle Counter Correction Factors. Atmospheric Meas. Tech. 2020, 13 (3), 1181–1193. 10.5194/amt-13-1181-2020.
- (19). Liu X; Jayaratne R; Thai P; Kuhn T; Zing I; Christensen B; Lamont R; Dunbabin M; Zhu S; Gao J; Wainwright D; Neale D; Kan R; Kirkwood J; Morawska L. Low-Cost Sensors as an Alternative for Long-Term Air Quality Monitoring. Environ. Res. 2020, 185, 109438. 10.1016/j.envres.2020.109438.
- (20). Considine EM; Reid CE; Ogletree MR; Dye T. Improving Accuracy of Air Pollution Exposure Measurements: Statistical Correction of a Municipal Low-Cost Airborne Particulate Matter Sensor Network. Environ. Pollut. 2021, 268, 115833. 10.1016/j.envpol.2020.115833.
- (21). Zusman M; Schumacher CS; Gassett AJ; Spalt EW; Austin E; Larson TV; Carvlin G; Seto E; Kaufman JD; Sheppard L. Calibration of Low-Cost Particulate Matter Sensors: Model Development for a Multi-City Epidemiological Study. Environ. Int. 2020, 134, 105329. 10.1016/j.envint.2019.105329.
- (22). Malings C; Tanzer R; Hauryliuk A; Saha PK; Robinson AL; Presto AA; Subramanian R. Fine Particle Mass Monitoring with Low-Cost Sensors: Corrections and Long-Term Performance Evaluation. Aerosol Sci. Technol. 2020, 54 (2), 160–174. 10.1080/02786826.2019.1623863.
- (23). Heffernan C; Peng R; Gentner DR; Koehler K; Datta A. Gaussian Process Filtering for Calibration of Low-Cost Air-Pollution Sensor Network Data. arXiv preprint arXiv:2203.14775, submitted 2022-03-28 (accessed 2022-05-06).
- (24). Bi J; Stowell J; Seto EYW; English PB; Al-Hamdan MZ; Kinney PL; Freedman FR; Liu Y. Contribution of Low-Cost Sensor Measurements to the Prediction of PM2.5 Levels: A Case Study in Imperial County, California, USA. Environ. Res. 2020, 180, 108810. 10.1016/j.envres.2019.108810.
- (25). Morawska L; Thai PK; Liu X; Asumadu-Sakyi A; Ayoko G; Bartonova A; Bedini A; Chai F; Christensen B; Dunbabin M; Gao J; Hagler GSW; Jayaratne R; Kumar P; Lau AKH; Louie PKK; Mazaheri M; Ning Z; Motta N; Mullins B; Rahman MM; Ristovski Z; Shafiei M; Tjondronegoro D; Westerdahl D; Williams R. Applications of Low-Cost Sensing Technologies for Air Quality Monitoring and Exposure Assessment: How Far Have They Gone? Environ. Int. 2018, 116, 286–299. 10.1016/j.envint.2018.04.018.
- (26). Clarity. Guide to Maximizing Your Air Quality Budget in a Post-COVID World. https://www.clarity.io/landing-pages/download-guide (accessed 2022-01-27).
- (27). IQAir. Live Animated Air Quality Map. https://www.iqair.com/us/air-quality-map (accessed 2022-11-03).
- (28). AirNow Fire and Smoke Map. https://fire.airnow.gov/# (accessed 2022-11-03).
- (29). Burke M; Driscoll A; Heft-Neal S; Xue J; Burney J; Wara M. The Changing Risk and Burden of Wildfire in the United States. Proc. Natl. Acad. Sci. 2021, 118 (2), e2011048118. 10.1073/pnas.2011048118.
- (30). Goss M; Swain DL; Abatzoglou JT; Sarhadi A; Kolden CA; Williams AP; Diffenbaugh NS. Climate Change Is Increasing the Likelihood of Extreme Autumn Wildfire Conditions across California. Environ. Res. Lett. 2020, 15 (9), 094016. 10.1088/1748-9326/ab83a7.
- (31). Gupta P; Doraiswamy P; Levy R; Pikelnaya O; Maibach J; Feenstra B; Polidori A; Kiros F; Mills KC. Impact of California Fires on Local and Regional Air Quality: The Role of a Low-Cost Sensor Network and Satellite Observations. GeoHealth 2018, 2 (6), 172–181. 10.1029/2018GH000136.
- (32). Liu JC; Wilson A; Mickley LJ; Dominici F; Ebisu K; Wang Y; Sulprizio MP; Peng RD; Yue X; Son J-Y; Anderson GB; Bell ML. Wildfire-Specific Fine Particulate Matter and Risk of Hospital Admissions in Urban and Rural Counties. Epidemiology 2017, 28 (1), 77–85. 10.1097/EDE.0000000000000556.
- (33). Di Q. An Ensemble-Based Model of PM2.5 Concentration across the Contiguous United States with High Spatiotemporal Resolution. Environ. Int. 2019, 13.
- (34). August L. CalEnviroScreen 3.0. OEHHA. https://oehha.ca.gov/calenviroscreen/report/calenviroscreen-30 (accessed 2021-11-06).
- (35). Williams R; Duvall R; Kilaru V; Hagler G; Hassinger L; Benedict K; Rice J; Kaufman A; Judge R; Pierce G; Allen G; Bergin M; Cohen RC; Fransioli P; Gerboles M; Habre R; Hannigan M; Jack D; Louie P; Martin NA; Penza M; Polidori A; Subramanian R; Ray K; Schauer J; Seto E; Thurston G; Turner J; Wexler AS; Ning Z. Deliberating Performance Targets Workshop: Potential Paths for Emerging PM2.5 and O3 Air Sensor Progress. Atmospheric Environ. X 2019, 2, 100031. 10.1016/j.aeaoa.2019.100031.
- (36). Barkjohn KK; Gantt B; Clements AL. Development and Application of a United States-Wide Correction for PM2.5 Data Collected with the PurpleAir Sensor. Atmospheric Meas. Tech. 2021, 14 (6), 4617–4637. 10.5194/amt-14-4617-2021.
- (37). AQI Basics. AirNow.gov. https://www.airnow.gov/aqi/aqi-basics (accessed 2021-11-06).
- (38). The AQI Equation. AirNow Discussion Forum. https://forum.airnowtech.org/t/the-aqi-equation/169 (accessed 2021-11-06).
- (39). U.S. Census Bureau. American Community Survey Data Profiles. https://www.census.gov/programs-surveys/acs/data/eeo-data/ (accessed 2022-01-27).
- (40). Walker K. tidycensus. https://cran.r-project.org/web/packages/tidycensus/index.html.
- (41). European Center for Environment and Health, Bonn Office. Effect of Air Pollution on Children’s Health and Development: A Review of the Evidence; World Health Organization, Europe, 2005.
- (42). Chu H-J; Ali MZ; He Y-C. Spatial Calibration and PM2.5 Mapping of Low-Cost Air Quality Sensors. Sci. Rep. 2020, 10 (1), 22079. 10.1038/s41598-020-79064-w.
- (43). Huang K; Bi J; Meng X; Geng G; Lyapustin A; Lane KJ; Gu D; Kinney PL; Liu Y. Estimating Daily PM2.5 Concentrations in New York City at the Neighborhood-Scale: Implications for Integrating Non-Regulatory Measurements. Sci. Total Environ. 2019, 697, 134094. 10.1016/j.scitotenv.2019.134094.
- (44). Lu T; Bechle MJ; Wan Y; Presto AA; Hankey S. Using Crowd-Sourced Low-Cost Sensors in a Land Use Regression of PM2.5 in 6 US Cities. Air Qual. Atmosphere Health 2022, 15 (4), 667–678. 10.1007/s11869-022-01162-7.
- (45). Bi J; Carmona N; Blanco MN; Gassett AJ; Seto E; Szpiro AA; Larson TV; Sampson PD; Kaufman JD; Sheppard L. Publicly Available Low-Cost Sensor Measurements for PM2.5 Exposure Modeling: Guidance for Monitor Deployment and Data Selection. Environ. Int. 2022, 158, 106897. 10.1016/j.envint.2021.106897.
- (46). Gressent A; Malherbe L; Colette A; Rollin H; Scimia R. Data Fusion for Air Quality Mapping Using Low-Cost Sensor Observations: Feasibility and Added-Value. Environ. Int. 2020, 143, 105965. 10.1016/j.envint.2020.105965.
- (47). Wang W-CV; Lung S-CC; Liu C-H. Application of Machine Learning for the In-Field Correction of a PM2.5 Low-Cost Sensor Network. Sensors 2020, 20 (17), 5002. 10.3390/s20175002.
- (48). Tanzer R; Malings C; Hauryliuk A; Subramanian R; Presto AA. Demonstration of a Low-Cost Multi-Pollutant Network to Quantify Intra-Urban Spatial Variations in Air Pollutant Source Impacts and to Evaluate Environmental Justice. Int. J. Environ. Res. Public Health 2019, 16 (14), 2523. 10.3390/ijerph16142523.
- (49). Delaine F; Lebental B; Rivano H. In Situ Calibration Algorithms for Environmental Sensor Networks: A Review. IEEE Sens. J. 2019, 19 (15), 5968–5978. 10.1109/JSEN.2019.2910317.
- (50). Malings C; Tanzer R; Hauryliuk A; Kumar SPN; Zimmerman N; Kara LB; Presto AA; Subramanian R. Development of a General Calibration Model and Long-Term Performance Evaluation of Low-Cost Sensors for Air Pollutant Gas Monitoring. Atmospheric Meas. Tech. 2019, 12 (2), 903–920. 10.5194/amt-12-903-2019.
- (51). Johnson K; Holder A; Frederick S; Clements A. PurpleAir PM2.5 U.S. Correction and Performance During Smoke Events; EPA, 2020. https://cfpub.epa.gov/si/si_public_record_Report.cfm?dirEntryId=349513&Lab=CEMM (accessed 2021-11-06).
- (52). Kelp M; Lin S; Kutz JN; Mickley LJ. A New Approach for Determining Optimal Placement of PM2.5 Air Quality Sensors: Case Study for the Contiguous United States. Environ. Res. Lett. 2022, 17 (3), 034034. 10.1088/1748-9326/ac548f.
- (53). Diggle PJ; Giorgi E. Preferential Sampling of Exposure Levels. In Handbook of Environmental Statistics; CRC Press: Boca Raton, 2019; pp 477–490.
- (54). D’Antoni D; Auyeung V; Walton H; Fuller GW; Grieve A; Weinman J. The Effect of Evidence and Theory-Based Health Advice Accompanying Smartphone Air Quality Alerts on Adherence to Preventative Recommendations during Poor Air Quality Days: A Randomised Controlled Trial. Environ. Int. 2019, 124, 216–235. 10.1016/j.envint.2019.01.002.
- (55). Saberian S; Heyes A; Rivers N. Alerts Work! Air Quality Warnings and Cycling. Resour. Energy Econ. 2017, 49, 165–185. 10.1016/j.reseneeco.2017.05.004.
- (56). Wen X-J; Balluz L; Mokdad A. Association Between Media Alerts of Air Quality Index and Change of Outdoor Activity Among Adult Asthma in Six States, BRFSS, 2005. J. Community Health 2009, 34 (1), 40–46. 10.1007/s10900-008-9126-4.
- (57). Neidell M; Kinney PL. Estimates of the Association between Ozone and Asthma Hospitalizations That Account for Behavioral Responses to Air Quality Information. Environ. Sci. Policy 2010, 13 (2), 97–103. 10.1016/j.envsci.2009.12.006.
- (58). Buonocore JJ; Robinson LA; Hammitt JK; O’Keeffe L. Estimating the Potential Health Benefits of Air Quality Warnings. Risk Anal. 2021, 41 (4), 645–660. 10.1111/risa.13640.
- (59). AirData Website File Download Page. https://aqs.epa.gov/aqsweb/airdata/download_files.html (accessed 2021-11-14).
- (60). Sardegna C. purple_air_api. GitHub. https://github.com/ReagentX/purple_air_api (accessed 2021-04-16).
- (61). School Locations & Geoassignments 2020–2021; Education Demographic and Geographic Estimates, National Center for Education Statistics. https://nces.ed.gov/programs/edge/geographic/schoollocations (accessed 2021-11-06).
- (62). National Highway Planning Network; U.S. DOT Federal Highway Administration. https://www.fhwa.dot.gov/planning/processes/tools/nhpn/index.cfm (accessed 2021-11-06).
- (63). Economic Research Service, U.S. Department of Agriculture. “What Is Rural?” https://www.ers.usda.gov/topics/rural-economy-population/rural-classifications/what-is-rural.aspx#:~: (accessed 2022-01-03).