2021 May 11;9:72420–72450. doi: 10.1109/ACCESS.2021.3079121

TABLE 7. Data and ML Methods Used in Studies on COVID-19 and Air Quality.

Author | Data and sources | ML methods
[70] | COVID-19 death cases from US Facts; pollutant data (e.g., Inline graphic, benzene, formaldehyde, acetaldehyde, carbon tetrachloride) from the Environmental Protection Agency and the Centers for Disease Control and Prevention; weather (e.g., temperature, precipitation, sunlight and UV exposure), land cover, and health status (e.g., disabled, obese, overweight) from the Centers for Disease Control and Prevention; socio-economics (e.g., health insurance, poverty, income) and commuting information (e.g., travel modes, time) from the US Census | Geographically weighted RF (GW-RF)
[71] | COVID-19 epidemiology data from [128] and New York Times COVID-19; daily air traffic (people/day) from the International Air Transport Association | Susceptible-exposed-infectious-recovered (SEIR) models, Bayesian inference models
[83] | Flight data from the Bureau of Transportation Statistics; ground traffic data from NYC Open Data; air pollutant data (e.g., CO, Inline graphic, ozone, and Inline graphic) from the Aura satellite (OMI instrument) and the Environmental Protection Agency | Support vector machine (SVM)
[91] | Inline graphic, Inline graphic, and Inline graphic concentrations from the Secretary of the Environment of the Municipality of the Metropolitan District of Quito | Parametric analysis
[96] | Ground-based Inline graphic concentration from the Central Pollution Control Board (CPCB); satellite-derived MODIS Aerosol Optical Depth (AOD) data; meteorological data (wind speed, temperature, rainfall, relative humidity, and mixing height) from the India Meteorological Department | Artificial neural network (ANN)
[97] | Meteorological data (e.g., temperature, precipitation, wind speed, wind direction, and air pressure), air quality data for the years 2014–2020, and lockdown data from the Austrian government | Principal component analysis (PCA), random forest (RF)
[98] | Eight meteorological parameters (e.g., surface wind components, surface temperature, relative humidity, cloud coverage, precipitation, pressure, and planetary boundary layer) from government agencies | Extreme Gradient Boosting Decision Tree (XGBDT)
[93] | Daily minimum and maximum temperature, average wind speed and direction, average relative humidity, daily cumulative precipitation, and Inline graphic and Inline graphic concentrations from ARPA Lombardia; time and seasonal variables | XGBDT
[99] | Meteorological data (e.g., temperature, pressure, wind speed, cloud cover, solar radiation, ultraviolet radiation) from the ERA5 reanalysis dataset; Inline graphic data from the Earth Sciences Department of the Barcelona Supercomputing Center | Gradient Boosting Decision Tree (GBDT)
[100] | CO, Inline graphic, Inline graphic, Inline graphic, Inline graphic, and Inline graphic data from six monitoring stations between March and April | Reduced-space Gaussian process regression, ANN
[101] | Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, Inline graphic, toluene, benzene, and NH3 data from the Central Pollution Control Board and the Ministry of Health and Family Welfare (MoHFW) | Decision tree (DT), RF
[102] | Inline graphic data from the monitoring station of the US Embassy, Dhaka; Inline graphic, Inline graphic, CO, and Inline graphic from AirNow; Inline graphic measured by the Copernicus Sentinel-5 Precursor Tropospheric Monitoring Instrument | Generalized additive models (GAMs), wavelet coherence, RF
[103] | Daily COVID-19 cases and lockdown level from the Statistical Portal of São Paulo State; meteorological variables (e.g., relative humidity, maximum temperature, atmospheric pressure, wind speed, and global solar radiation), CO, Inline graphic, Inline graphic, NO, Inline graphic, and Inline graphic from the Environmental Company of São Paulo State database (CETESB) | ANN models (Multilayer Perceptron overview, Radial basis function, Extreme Learning Machines, Echo State Networks)
[104] | Daily averages of Inline graphic and Inline graphic from environmental monitoring stations located in the cities; COVID-19 deaths, resuscitations, and hospitalizations from the French National Public Health Agency | ANN
[105] | Italian Civil Protection, Regional Environmental Protection Agencies (ARPA) | SVM, K-Nearest Neighbor (KNN), GBDT, Classification and regression tree (CART), RF, Multilayer perceptron (MLP), Ada boosting with decision tree (AdaBoost), Extra tree (ET)
[106] | Nine observation sites in Hangzhou, China | RF
[107] | City-level hourly data of 4 pollutants from the Qingyue Open Environmental Data Center; meteorological data (e.g., temperature, relative humidity, wind direction, wind speed, and air pressure) from the “worldmet” R package | RF, new augmented synthetic control method
[108] | COVID-19 positivity, mortality, and total case count from Italian Civil Protection; air pollutants (i.e., Inline graphic, Inline graphic, Inline graphic, Inline graphic, CO, benzene, and Inline graphic) from the Italian Ministry of Agriculture, Food and Forestry and the Regional Environmental Protection Agency (ARPA); air pollution from the Italian National Institute of Statistics | RF
[109] | Emission factors from different sources (open-access and near-real-time measured activity data, proxy indicators, and other available reports); stringency index from the Oxford COVID-19 Government Response Tracker (OxCGRT) | GBDT
[110] | Pollutant (Inline graphic, Inline graphic, and Inline graphic) emissions from Open Government Data (OGD) and the World Bank; per capita GDP from the Federal Reserve Bank of St. Louis | ML-based complex causality algorithm (D2C)
[111] | Pollutant (e.g., Inline graphic, CO, and Inline graphic) data captured by MONICA (a cooperative air quality monitoring station) and transmitted via a Bluetooth serial interface to a Raspberry Pi Mod. 3+ based datasink running Raspbian OS | Shallow neural network (SNN)
[112] | Pollutant (e.g., Inline graphic, Inline graphic, Inline graphic, and CO) data from the China National Environmental Monitoring Center; meteorological data (e.g., wind direction and speed, temperature, relative humidity, and pressure) from the NOAA Integrated Surface Database | RF
[113] | Inline graphic data from the operational Copernicus Sentinel-5 Precursor (S5P) TROPOMI, CAMS regional air quality models, and the European Centre for Medium-Range Weather Forecasts (ECMWF) | GBDT