Towards a Unified Terminology of Processing Levels for Low-cost Air-Quality Sensors

Philipp Schneider; Alena Bartonova; Nuria Castell; Franck R Dauge; Michel Gerboles; Gayle S W Hagler; Christoph Hüglin; Roderic L Jones; Sean Kahn; Alastair C Lewis; Bas Mijling; Michael Müller; Michele Penza; Laurent Spinelle; Brian Stacey; Matthias Vogt; Joost Wesseling; Ronald W Williams

doi:10.1021/acs.est.9b03950

Author manuscript; available in PMC: 2021 Feb 16.

Published in final edited form as: Environ Sci Technol. 2019 Jul 29;53(15):8485–8487. doi: 10.1021/acs.est.9b03950

PMCID: PMC7886280; NIHMSID: NIHMS1539801; PMID: 31353903

Low-cost sensor systems for measuring air quality have received widespread scientific and media attention in recent years. It has become established practice to improve the data quality of such sensor systems by colocating them at traditional air quality monitoring stations equipped with reference instrumentation and field-calibrating individual units using various statistical techniques. Methods range from (multi)linear regression to more complex statistical techniques, often using additional predictor variables such as air temperature or relative humidity (e.g., Spinelle et al.(4)), and occasionally data not actually measured by the sensor system itself (e.g., station observations or model output). Most of these techniques improve the level of agreement between sensor-derived data and reference data, in many cases eliminating issues such as chemical interferences and sensor-to-sensor variability. It is not always clear, however, to what extent the data arising from such processing are still a true and independent measurement by the sensor system, or some blend of secondary data and model prediction. Noticing this development, Hagler et al. (2018)(2) warned that some systems may use predictor variables for calibration in such a way that a line is crossed from justifiable and empirical correction of a known artifact to a method that is essentially a predictive statistical model. In addition, the processing steps that are carried out along the way are often not clearly communicated. The current lack of governmental or third-party standards for low-cost sensor performance(5) and the occasional lack of distinction between sensors and sensor systems further complicate data processing.
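As a rough illustration of the field-calibration approach described above, the following Python sketch fits a multilinear regression of reference concentrations on the raw sensor signal plus temperature and relative humidity. The colocation data are synthetic, and the function names (`design_matrix`, `fit_field_calibration`, `apply_field_calibration`) and all coefficients are assumptions made for this example; it is not the procedure of any particular study.

```python
import numpy as np


def design_matrix(raw, temp, rh):
    """Stack an intercept column with the three predictors."""
    raw = np.asarray(raw, dtype=float)
    return np.column_stack([np.ones_like(raw), raw, temp, rh])


def fit_field_calibration(raw, temp, rh, reference):
    """Ordinary least squares: reference ~ b0 + b1*raw + b2*temp + b3*rh."""
    coef, *_ = np.linalg.lstsq(design_matrix(raw, temp, rh), reference, rcond=None)
    return coef


def apply_field_calibration(coef, raw, temp, rh):
    """Apply fitted coefficients to sensor data."""
    return design_matrix(raw, temp, rh) @ coef


def rmse(a, b):
    """Root-mean-square difference between two series."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))


# Synthetic colocation data (all numbers are invented for illustration).
rng = np.random.default_rng(0)
n = 500
temp = rng.uniform(0.0, 30.0, n)        # air temperature, deg C
rh = rng.uniform(20.0, 90.0, n)         # relative humidity, %
reference = rng.uniform(5.0, 80.0, n)   # reference instrument, ug/m3
# Hypothetical sensor response with T/RH interference and noise.
raw = 3.0 + 0.8 * reference + 0.5 * temp - 0.1 * rh + rng.normal(0.0, 1.0, n)

coef = fit_field_calibration(raw, temp, rh, reference)
calibrated = apply_field_calibration(coef, raw, temp, rh)
rmse_before = rmse(raw, reference)
rmse_after = rmse(calibrated, reference)
print(f"RMSE before: {rmse_before:.2f}, after: {rmse_after:.2f}")
```

The sketch evaluates the fit on the same data used for training, which overstates performance; a real evaluation would hold out part of the colocation period.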

Adding to the observations and recommendations made by Hagler et al. (2018)(2), we have further noticed that there is substantial and consistent confusion within both the scientific community and the interested public regarding the amount and type of processing applied to sensor data, and at what point derived data can be considered to have lost a meaningful link to quantitative traceability. The relevance of this issue to air quality sensors is significant since in most countries air quality targets and standards are set out in primary legislation and measured attainment of those targets has demanding traceability requirements. Clarity regarding the level of sensor data processing is important for evaluation of sensor technology, as well as correct use and interpretation of its data.

To address this challenge, we propose a unified terminology of processing levels for low-cost air quality sensor systems. A strict sequence of processing levels is already common practice in satellite remote sensing, where it has been in wide use across multiple agencies for decades.(1) We have adapted these levels and suggest a sequence of processing levels for data from low-cost air quality sensor systems (Table 1).

Table 1: Proposed Processing Levels for Low-Cost Sensor Systems for Air Quality ^a

Level	Name	Definition	Example: Gas sensors	Example: Particle sensors
Level-0	Raw measurements	Original measurand produced by the sensor system	Voltage corresponding to measured quantity, e.g. current for electrochemical sensors, resistance or conductance for metal oxide sensors or transmitted light intensity for infra-red sensors	Voltage corresponding to light scattered by nephelometers, or to particle counts for bins of optical particle counters
Level-1	Intermediate geophysical quantities	Estimate derived from corresponding Level-0 data, using basic physical principles or simple calibration equations, and no compensation schemes.	For electrochemical sensors, NO₂ concentration in μg/m³ or ppb, using only Level-0 data from the NO₂ sensor itself with no additional corrections beyond factory calibration. Essentially “raw data in concentration units”.	Binned particle counts or PM mass in μg/m³ derived from Level-0 data using simple calibration and assumed particle density
Level-2A	Standard geophysical quantities	Estimate using sensor plus other on-board sensors demonstrated as appropriate to use for artifact correction and directly related to measurement principle.	NO₂ concentration in μg/m³ or ppb, derived from onboard NO₂/NO/O₃ sensors, corrected for interferences and/or T/RH effects using onboard data	PM concentration in μg/m³, corrected for T/RH effects with onboard-measured T/RH
Level-2B	Standard geophysical quantities-extended	As Level-2A but also using external data demonstrated as appropriate to use for artifact correction and directly related to measurement principle	As Level-2A but using external T/RH from nearby station	As Level-2A but using external T/RH from nearby station
Measurement/prediction boundary ^b
Level-3	Advanced geophysical quantities	Estimate using sensor plus internal/external data to adjust values, not constrained to data inputs proven as causes of measurement bias or related to measurement principle	NO₂ concentration in μg/m³ or ppb, corrected for T/RH effects, and using data from nearby meteorological stations or models	PM concentration in μg/m³, corrected for T/RH effects and using data from nearby stations or models
Level-4	Spatially continuous geophysical quantities	Spatially continuous maps derived from network of distributed sensor systems	Map of NO₂ concentrations in μg/m³ or ppb, e.g. derived using assimilation of sensor network data into physical model	Map of PM_2.5 concentrations in μg/m³, e.g. derived using assimilation of sensor network data into physical model

^a T/RH stands for temperature and relative humidity. The spatial support of all Levels except Level-4 is point measurements at single locations or for entire networks.

^b See Hagler et al. (2018).(2)
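One way to read Table 1 is as a decision rule over the kinds of inputs a processing chain uses. The Python sketch below encodes that reading. The class and field names (`ProcessingLevel`, `ProcessingSteps`, `classify`, `is_measurement`) are hypothetical, and the mapping is only an interpretation of the table's definitions, not an official classifier.

```python
from dataclasses import dataclass
from enum import Enum


class ProcessingLevel(Enum):
    LEVEL_0 = "Level-0"
    LEVEL_1 = "Level-1"
    LEVEL_2A = "Level-2A"
    LEVEL_2B = "Level-2B"
    LEVEL_3 = "Level-3"
    LEVEL_4 = "Level-4"


@dataclass(frozen=True)
class ProcessingSteps:
    """Hypothetical description of what was done to a data product."""
    in_physical_units: bool = False             # converted from raw signal
    onboard_artifact_correction: bool = False   # e.g. on-board T/RH
    external_artifact_correction: bool = False  # e.g. T/RH from nearby station
    unconstrained_inputs: bool = False          # inputs not tied to measurement principle
    spatially_continuous_map: bool = False      # map derived from a sensor network


def classify(steps: ProcessingSteps) -> ProcessingLevel:
    """Return the highest level implied by the processing steps."""
    if steps.spatially_continuous_map:
        return ProcessingLevel.LEVEL_4
    if steps.unconstrained_inputs:
        return ProcessingLevel.LEVEL_3
    if steps.external_artifact_correction:
        return ProcessingLevel.LEVEL_2B
    if steps.onboard_artifact_correction:
        return ProcessingLevel.LEVEL_2A
    if steps.in_physical_units:
        return ProcessingLevel.LEVEL_1
    return ProcessingLevel.LEVEL_0


# Levels on the measurement side of the measurement/prediction boundary.
MEASUREMENT_LEVELS = frozenset({
    ProcessingLevel.LEVEL_0,
    ProcessingLevel.LEVEL_1,
    ProcessingLevel.LEVEL_2A,
    ProcessingLevel.LEVEL_2B,
})


def is_measurement(level: ProcessingLevel) -> bool:
    return level in MEASUREMENT_LEVELS


no2_onboard = ProcessingSteps(in_physical_units=True, onboard_artifact_correction=True)
print(classify(no2_onboard).value, is_measurement(classify(no2_onboard)))
```

Checking the most processed condition first means a product is labeled by the furthest step it has undergone, regardless of which intermediate steps were applied.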

The proposed processing levels range from Level-0, indicating output from the electronically interfaced raw sensor signal, to Level-4, representing a spatially continuous map of concentrations derived from a network of sensor systems, for example using spatial interpolation or data assimilation into a chemical transport model.(3) The levels therefore represent a sequence from least processed to most processed information. Loosely mirroring the processing levels typically used for satellite data, Level-0 represents raw instrument output, Level-2 represents the standard product used for most scientific applications, and Level-4 represents a combination of the data with other spatial data sources (e.g., a model). However, the proposed levels differ in their specifics from those used in remote sensing to accommodate the unique requirements of low-cost sensor data.
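To make the step from point data to a Level-4 product concrete, the Python sketch below builds a spatially continuous field from a handful of sensor locations using inverse-distance weighting. This is a deliberately simple stand-in for the spatial interpolation mentioned above, not the data assimilation approach of reference (3); the function name `idw_map`, the coordinates, and the concentration values are all invented for illustration.

```python
import numpy as np


def idw_map(sensor_xy, sensor_values, grid_xy, power=2.0, eps=1e-12):
    """Inverse-distance-weighted estimate at each grid point.

    sensor_xy: (n_sensors, 2) coordinates, sensor_values: (n_sensors,),
    grid_xy: (n_grid, 2) coordinates. Returns an array of shape (n_grid,).
    """
    sensor_xy = np.asarray(sensor_xy, dtype=float)
    sensor_values = np.asarray(sensor_values, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    # Pairwise distances between grid points and sensors: (n_grid, n_sensors).
    dist = np.linalg.norm(grid_xy[:, None, :] - sensor_xy[None, :, :], axis=2)
    # Clamp tiny distances so a grid point on top of a sensor does not divide by zero.
    weights = 1.0 / np.maximum(dist, eps) ** power
    return (weights @ sensor_values) / weights.sum(axis=1)


# Four hypothetical sensor systems at the corners of a unit square.
sensor_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
no2 = np.array([10.0, 20.0, 30.0, 40.0])  # ug/m3, invented values

# Regular 5 x 5 grid covering the square.
gx, gy = np.meshgrid(np.linspace(0.0, 1.0, 5), np.linspace(0.0, 1.0, 5))
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])

no2_map = idw_map(sensor_xy, no2, grid_xy).reshape(gx.shape)
print(no2_map.round(1))
```

Because each grid value is a weighted average of the sensor values, the map never leaves the range of the observations, which is one reason model-based approaches are often preferred where emission sources lie between sensors.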

The usability of data at each processing level depends on the end-user application. Level-1 or Level-2 data, if they meet the right standards, may be useful for measuring progress against air quality targets. Level-4 is a blended product using data from multiple sources that may be most useful and applicable for public information systems. Note that the level designation merely represents the amount of processing carried out on the data set and does not reflect data quality. The latter needs to be ensured using appropriate QA/QC strategies when sensor systems are deployed. The levels further do not have to be passed sequentially but can be labels describing the approximate amount of processing applied to a data product (e.g., directly going from Level-0 to Level-2). The levels do not imply anything about processing location (e.g., in the sensor system itself or in the cloud) or whether data are available in near real-time. Most of the sensor systems that can be readily purchased on the market nowadays offer Level-1 or Level-2 data, although some open systems also provide Level-0 data. We consider the step from Level-2 to Level-3 as the transition point from true measurements to a type of statistical prediction or modeling. All levels except Level-4 apply to individual sensor systems as well as entire networks. However, exploiting the “network knowledge” can add significant value to the data. Such cases are mostly covered by Level-3, but, once more mature, network-based processing techniques could conceivably receive their own terminology.
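The points above suggest that a processing level is best carried as one metadata field among several independent ones. The following Python sketch shows one possible label for a data product in which the level, the QA/QC outcome, the processing location, and near-real-time availability are recorded separately. The class `DataProductLabel` and its field names are hypothetical and are not part of the proposed terminology.

```python
from dataclasses import dataclass
from typing import Optional

VALID_LEVELS = ("Level-0", "Level-1", "Level-2A", "Level-2B", "Level-3", "Level-4")


@dataclass(frozen=True)
class DataProductLabel:
    """Hypothetical metadata label; each field is independent of the others."""
    processing_level: str
    qa_qc_passed: Optional[bool] = None      # None means not yet assessed
    processing_location: str = "unspecified"  # e.g. "on-device" or "cloud"
    near_real_time: Optional[bool] = None

    def __post_init__(self):
        if self.processing_level not in VALID_LEVELS:
            raise ValueError(f"unknown processing level: {self.processing_level!r}")

    def summary(self) -> str:
        qa = {None: "not assessed", True: "passed", False: "failed"}[self.qa_qc_passed]
        return f"{self.processing_level} (QA/QC: {qa}, processed: {self.processing_location})"


# A heavily processed product can still fail QA/QC, and a lightly processed one can pass.
mapped = DataProductLabel("Level-4", qa_qc_passed=False, processing_location="cloud")
direct = DataProductLabel("Level-1", qa_qc_passed=True, processing_location="on-device")
print(mapped.summary())
print(direct.summary())
```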

It should be noted that we do not believe that any of the described levels are inherently better than others; they simply serve different purposes and user communities. However, we do think it is essential that the type and amount of processing performed on a given sensor data set is communicated transparently so that the data users can make informed decisions. This is particularly important for scientific, operational, and policy applications where methods have to be thoroughly documented and their fitness for purpose demonstrated.

We believe that the presented harmonized terminology of processing levels can contribute toward this goal without requiring the sensor manufacturers to necessarily publicize their proprietary algorithms (although entirely open systems are preferable, particularly for scientific applications). It is further our hope that adoption of the suggested processing levels (or a derivation) within the community will contribute to simplifying and improving the communication between manufacturers, researchers, and other users. Overall, we think that a unified terminology is a first step toward improved data integrity and transparency and that it will ultimately lead to a better use of this new technology.

Footnotes

The authors declare no competing financial interest.

The views expressed in this paper are those of the authors and do not necessarily reflect the views or policies of the United States Environmental Protection Agency. It has been subjected to Agency review and approved for publication. Mention of trade names or commercial products does not constitute an endorsement or recommendation for use.

References

1. EOS Data Panel (1986). Earth Observing System: Report of the EOS Data Panel. Data and information system. Volume IIa. National Aeronautics and Space Administration, Goddard Space Flight Center.
2. Hagler GS, Williams R, Papapostolou V, and Polidori A. (2018). Air Quality Sensors and Data Adjustment Algorithms: When Is It No Longer a Measurement? Environmental Science and Technology, 52(10):5530–5531.
3. Schneider P, Castell N, Vogt M, Dauge FR, Lahoz WA, and Bartonova A. (2017). Mapping urban air quality in near real-time using observations from low-cost sensors and model information. Environment International, 106(May):234–247.
4. Spinelle L, Gerboles M, Villani MG, Aleixandre M, and Bonavitacola F. (2015). Field calibration of a cluster of low-cost available sensors for air quality monitoring. Part A: Ozone and nitrogen dioxide. Sensors and Actuators B: Chemical, 215:249–257.
5. Williams R, Duvall R, Kilaru V, Hagler G, Benedict K, Rice J, Kaufman A, et al. (2019). Deliberating Performance Targets Workshop: Potential Paths for Sensor Progress. Atmospheric Environment. Published April 19, 2019. doi: 10.1016/j.aeaoa.2019.100031

