Abstract
The impacts of air pollution on COVID-19 have gained increasing attention. However, most existing studies are based on a single country, and the results reported in different papers vary widely. We attempt to inform the debate about the long-term effects of air pollution on COVID-19 by conducting a multi-country analysis using a spatial ecological design, covering Canada, Italy, England and the United States. The model allows for residual spatial autocorrelation after accounting for covariates. We conclude that the effects of PM2.5 and NO2 are inconsistent across countries. Specifically, NO2 was not found to be an important factor affecting COVID-19 infection, while the large effect of PM2.5 found in the US is not found in the other three countries. The Population Attributable Fraction for COVID-19 incidence ranges from 3.4% in Canada to 45.9% in Italy, although these estimates carry considerable uncertainty.
Keywords: Coronavirus disease, Spatial model, Epidemiology, Multi-country, INLA
1. Introduction
The current outbreak of coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), had led to 63 million cases and 1.4 million deaths worldwide by 30th Nov, 2020. Governments around the world imposed travel and business restrictions to slow the spread of COVID-19, which has affected almost every part of life all over the world. The pandemic-imposed lockdowns have had positive effects on air quality, as shown by AEA (2011), which reported that global energy-related carbon dioxide (CO2) emissions fell by 5.8%. Similarly, Gautam (2020) compared air quality data before and during the novel coronavirus pandemic and found a significant reduction in the level of nitrogen dioxide (NO2) in Asian and European countries due to COVID-19 lockdowns. However, since the long-term adverse effects of air pollution on human health (particularly respiratory and lung diseases) have been well documented (e.g., Pope et al., 2002, Janes et al., 2007, Haining et al., 2010, Huang et al., 2018), questions have been raised about the effects of chronic exposure on COVID-19 death and incidence rates. As a result, there has been a steady growth in the investigation of air pollution impacts on COVID-19 in the literature, with examples including Gupta et al. (2021), Gautam et al. (2021), Pozzer et al. (2020) and Wu et al. (2020).
A recent systematic review by Copat et al. (2020) pointed out the important contribution of long-term exposure to PM2.5 (atmospheric particulate matter with a diameter of less than 2.5 μm) and NO2 to COVID-19 spread and lethality. Additionally, a large study by Pozzer et al. (2020) estimated that particulate air pollution contributed 15% (95% confidence interval 7%, 33%) to COVID-19 mortality worldwide and found that a significant fraction was attributable to anthropogenic sources, of which 50%–60% was related to fossil fuel. However, there is a high degree of variation in the literature so far regarding the effects of air pollution on COVID-19. For example, Andree (2020) investigated the relationship between PM2.5 and COVID-19 in the Netherlands and found that expected COVID-19 cases increased by almost 100% when pollution concentration increased by 20% (2 μg/m³), while a study by Cole et al. (2020) found that an increase of 1 μg/m³ in PM2.5 increased COVID-19 cases by about 7% in Dutch municipalities.
Rather than studying a single country, in this study we attempt to inform the debate about the long-term effects of air pollution on COVID-19 outcomes by looking at COVID-19 cases across different countries, namely Canada, Italy, England and the United States. We focus on reported COVID-19 cases because: (1) most existing studies investigated the effects of air pollution on COVID-19 mortality, while there is only limited research focusing on the infection risk; (2) our hypothesis is that people with long-term exposure to high pollution could have reduced natural defences against respiratory infections, which in turn might make it easier for SARS-CoV-2 to cause COVID-19 infections and spread in the population. Specifically, for each country, we investigate whether long-term average exposure to air pollution (PM2.5 and NO2) increases the risk of COVID-19 infection, by comparing geographical contrasts in air pollution and incidence rates across small areas while accounting for potential confounders, including lung cancer incidence (a proxy for smoking), unemployment rate and proportions of visible or ethnic minorities.
The remainder of this paper is organized as follows. The data and its exploratory analysis are presented in Section 2, while the statistical methodology is outlined in Section 3. The results of the study are reported in Section 4, and the key conclusions are presented in Section 5.
2. Data
The data used in this study include COVID-19 cases, ambient air pollution concentrations and socio-economic confounders for four countries; a brief summary is given in Table 1. For each country, data were collected for non-overlapping areal units: 93 health regions, 107 provinces, 149 upper tier local authorities and 3108 counties for Canada, Italy, England and the mainland United States, respectively. Data sources and descriptions are summarized in the Appendix. The population of the mainland United States is 314.4 million, which is about 9 times that of Canada and 5 times that of Italy or England. Although its area is slightly larger than that of the US, Canada has a much smaller number of areal units in this study. The median populations of the areal units are 196, 386, 277 and 25 thousand for Canada, Italy, England and the mainland United States, respectively. There are 219 counties in the United States with a population greater than 300,000. More details are given in Table 1.
Table 1.
Data from Canada, Italy, England and the mainland United States, with population in millions and COVID-19 cases in thousands (up to 16th Oct, 2020). Covariate summaries are shown as 0.25 and 0.75 quantiles, with units of μg/m³ for PM2.5, ppb for NO2, % for unemployment rate and visible minorities, and cases per 100,000 for lung cancer mortality. In Italy the index of material and social vulnerability was used in place of the unemployment rate.
Variable | Canada | Italy | England | United States |
---|---|---|---|---|
Population | 36.3 | 60.2 | 55.5 | 314.4 |
Areal units | 93 | 107 | 149 | 3108 |
Total cases | 194 | 373 | 560 | 7784 |
PM2.5 | (5.37, 7.38) | (11.15, 16.12) | (7.59, 9.60) | (4.40, 6.90) |
NO2 | (2.52, 7.25) | (4.65, 9.69) | (8.00, 15.28) | (2.68, 4.58) |
Unemployment | (6.70, 10.00) | (98.07, 99.74) | (1.40, 2.60) | (3.10, 4.80) |
Visible minorities | (1.60, 6.70) | (4.18, 8.22) | (4.00, 25.50) | (4.47, 19.87) |
Lung cancer | (62.65, 79.33) | (47.95, 61.15) | (85.10, 125.80) | (40.70, 57.50) |
2.1. COVID-19 data
For each country, the disease data comprise the total confirmed COVID-19 cases in each areal unit (upper tier local authority for England, province for Italy, health region for Canada and county for the US) up to 16th Oct, 2020. We denote by $Y_k$ the reported number of COVID-19 cases for the $k$th areal unit. As the number of cases in an areal unit depends on its population, we calculate the expected number of cases in each areal unit ($E_k$) based on national incidence rates in sex and age groups, using (3).
2.2. Air pollution data
The most recently available 3-year data are used to represent the long-term areal unit level pollution concentrations, that is 2014–2016 for Canada and 2016–2018 for the other three countries (2017–2019 for US NO2).
In this study we consider PM2.5 and NO2 coming from two data sources (see Table A.1): measurements from a monitoring network (Canada, Italy, England, US NO2) and gridded modelled output (US PM2.5 and the remaining regions of Canada). We based our analysis on directly measured monitoring data wherever possible. For Italy, England and 83 of the 93 regions in Canada, the air pollution monitoring networks are sufficiently dense, so simple averages of monitoring data from stations within a 5 km buffer of each areal unit were computed.
Table A.1.
Data sources in this study.
The monitoring network in North America does not cover a large proportion of the counties in the US and a few more remote regions in Canada, and publicly available gridded data were used in these instances. For Canada, both PM2.5 and NO2 gridded data are available for recent years, and simple averages of values in grid cells in each of the remote regions were used as exposures. For the US, gridded PM2.5 data for recent years are available from Group (2021) and again averages within each county were computed. Gridded NO2 data for the US were not available for recent years at the time of writing, and we created our own spatial interpolation of ground monitors as described in Section 3.1.
Maps of PM2.5 and NO2 for each country are shown in Fig. A.1, Fig. A.2, Fig. A.3, Fig. A.4, while Fig. A.5 presents their distributions across areal units. The latter shows that the long-term averaged PM2.5 levels in Italy are much higher than the others, while the long-term averaged NO2 levels in England are more uniformly distributed across areal units.
Fig. A.1.
Canada data, including PM2.5 (μg/m³), NO2 (ppb), Lung cancer (per 100,000), Ethnicity (%) and Unemployment (%).
Fig. A.2.
United States data, including PM2.5 (μg/m³), NO2 (ppb), Lung cancer (per 100,000), Ethnicity (%) and Unemployment (%).
Fig. A.3.
Italy data, including PM2.5 (μg/m³), NO2 (ppb), Lung cancer (per 100,000), Ethnicity (%) and Vulnerability.
Fig. A.4.
England data, including PM2.5 (μg/m³), NO2 (ppb), Lung cancer (per 100,000), Ethnicity (%) and Unemployment (%).
Fig. A.5.
Empirical distributions of covariates, Canada (red), Italy (blue), England (green), United States (purple).
2.3. Socio-economic confounders
Confounders included in the model as areal-level covariates are: (i) the lung cancer mortality rate (cases per 100,000 people) as a proxy for smoking prevalence; (ii) the unemployment rate as a proxy for socio-economic deprivation; and (iii) the proportion of the population who belong to minority groups. The definition of minority groups differed across countries, reflecting the data available. In Canada the group used is visible minorities, as defined by the Census. For Italy we used individuals from Africa or Asia, for England the black and minority ethnic population, and for the US the non-white population. In addition, in Italy the index of material and social vulnerability was used in place of the unemployment rate, the former being believed to be a better proxy of socio-economic deprivation in the Italian context. The maps of confounders for each country are shown in Fig. A.1, Fig. A.2, Fig. A.3, Fig. A.4, while Fig. A.5 presents their distributions across areal units.
3. Methods
3.1. Air pollution interpolation
A geostatistical model was fit to NO2 monitoring data (2017–2019) from 517 ground stations in the US, creating spatial prediction (or Kriged) estimates on a fine grid covering the lower-48 states. Writing $Z_i$ as the measurement from the $i$th ground station, located at $s_i$, the model used is

$$Z_i \mid V \sim \mathrm{N}\{W(s_i)\gamma + V(s_i),\ \tau^2\}, \qquad \mathrm{cov}[V(s + h), V(s)] = \sigma_V^2\, r(h) \tag{1}$$

where $V(s)$ is a Gaussian random field with variance $\sigma_V^2$ and spatial correlation function $r(\cdot)$, $h$ is the separation vector and $\tau^2$ is the observation-level variance. The spatial covariates $W(s)$ include an intercept, gridded PM2.5 data, and gridded NO2 data from 2011, the most recent year available. A Matérn correlation function with shape parameter 1 was used for $r(\cdot)$. Maximum Likelihood Estimates of model parameters were obtained and spatial predictions were created using the R package geostatsp (Brown, 2015).
3.2. COVID-19 incidence model
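We use a spatial ecological design to estimate the impact of air pollution on counts of COVID-19 cases in small areas. Writing $Y_k$, $E_k$ and $\theta_k$ as the observed count, age–sex adjusted expected count, and relative risk for area $k$, the model is given by

$$Y_k \sim \mathrm{Poisson}(E_k \theta_k), \qquad \log(\theta_k) = X_k \beta + U_k, \qquad U_k \mid U_{-k} \sim \mathrm{N}\left( \frac{\rho \sum_{j \sim k} U_j}{\rho n_k + 1 - \rho},\ \frac{\sigma^2}{\rho n_k + 1 - \rho} \right) \tag{2}$$

where $j \sim k$ indicates the regions $j$ which share a common border with region $k$ and $n_k$ is the number of such regions. Expected counts are calculated as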
$$E_k = \sum_{\ell} P_{k\ell}\, r_{\ell} \tag{3}$$

where $P_{k\ell}$ is the population of the $\ell$th age–sex group in region $k$ and $r_\ell$ are age–sex specific incidence rates estimated as the total observed cases in group $\ell$ divided by the total population for group $\ell$.
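As a minimal sketch of the indirect standardisation in (3), the Python snippet below computes national age–sex rates and the resulting expected counts. The function name `expected_counts` and all numbers are hypothetical and chosen for illustration only; they are not data from the study.

```python
def expected_counts(national_cases, population):
    """Indirectly standardised expected counts.

    national_cases[l]: total observed cases in age-sex group l (all regions).
    population[k][l]:  population of age-sex group l in region k.
    """
    n_groups = len(national_cases)
    # National rate per group: total cases divided by total population.
    rates = [
        national_cases[l] / sum(row[l] for row in population)
        for l in range(n_groups)
    ]
    # Expected count per region: sum over groups of population times rate.
    expected = [sum(p * r for p, r in zip(row, rates)) for row in population]
    return rates, expected


# Invented example: three regions, two age-sex groups.
population = [[1000, 2000], [3000, 1000], [500, 500]]
national_cases = [60, 80]

rates, expected = expected_counts(national_cases, population)
print(rates)
print(expected)
```

By construction the expected counts sum to the national case total, so a ratio of observed to expected counts above one marks a region with more cases than its age–sex structure alone would predict.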
The spatial random effect $U_k$ follows a Leroux et al. (2000) model, with a spatial dependence parameter $\rho$ and variance parameter $\sigma^2$. When $\rho = 0$ the $U_k$ are independent and Normally distributed with variance $\sigma^2$, and the surface will be rough in space with each region being unrelated to regions nearby. As $\rho$ increases the value of $U_k$ will become progressively more dependent on the mean of the $U_j$ in neighbouring regions, and the surface becomes smoother. Setting $\rho = 1$ gives the standard spatial autoregressive model.
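The limiting behaviour described above can be checked numerically. The sketch below assumes the usual Leroux conditional specification, in which $U_k$ given the other regions has mean $\rho \sum_{j \sim k} U_j / (\rho n_k + 1 - \rho)$ and variance $\sigma^2 / (\rho n_k + 1 - \rho)$. The four-region neighbourhood structure, the values in `u` and the function names are hypothetical.

```python
def leroux_precision(neighbours, rho, sigma):
    """Precision matrix Q = [rho * (D - W) + (1 - rho) * I] / sigma^2."""
    n = len(neighbours)
    q = [[0.0] * n for _ in range(n)]
    for k, nbrs in enumerate(neighbours):
        q[k][k] = (rho * len(nbrs) + 1.0 - rho) / sigma ** 2
        for j in nbrs:
            q[k][j] = -rho / sigma ** 2
    return q


def conditional_moments(neighbours, u, k, rho, sigma):
    """Mean and variance of U_k given the values in all other regions."""
    denom = rho * len(neighbours[k]) + 1.0 - rho
    mean = rho * sum(u[j] for j in neighbours[k]) / denom
    return mean, sigma ** 2 / denom


# Hypothetical map: four regions in a row, 0 - 1 - 2 - 3.
neighbours = [[1], [0, 2], [1, 3], [2]]
u = [0.2, -0.1, 0.4, 0.0]

independent = conditional_moments(neighbours, u, 1, rho=0.0, sigma=1.0)
intrinsic = conditional_moments(neighbours, u, 1, rho=1.0, sigma=1.0)
print(independent)  # mean 0, variance sigma^2: no borrowing from neighbours
print(intrinsic)    # mean is the neighbour average, variance sigma^2 / n_k
```

With $\rho = 1$ every row of the precision matrix sums to zero, which is the intrinsic autoregressive case; intermediate values of $\rho$ interpolate between the two extremes.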
The explanatory variables in $X_k$ are ambient air pollutant concentrations and the confounders presented in the previous section. One interpretation of $U_k$ is that it accounts for unobserved confounders not included in the model, and we would expect $U_k = 0$ for area $k$ if all the important risk factors were included in $X_k$ (and these effects were linear). The regression parameters $\beta$ are assigned weakly informative zero-mean Gaussian priors with a large diagonal variance matrix.
The prior distributions for the spatial parameters are
(4) |
The Exponential prior distribution of $\sigma$ allows for, and encourages, small values and consequently a flat surface $U_k$. This would be expected if the covariates in $X_k$ were all correctly identified and accurately measured. The prior variance of the spatial dependence parameter was chosen to make the prior for $\rho$ reasonably uniform while discouraging the most extreme values.
The spatial models were implemented in the INLA software (Rue et al., 2009); the Leroux CAR prior distribution for spatially structured random effects is available in R-INLA as the “besagproper2” model. An alternative implementation of this prior distribution was first introduced in Ugarte et al. (2014) using the “generic1” model of R-INLA. The Integrated Nested Laplace Approximation (INLA) methodology is a computationally efficient alternative to simulation-based methods for fitting Bayesian models, and its R interface is increasingly popular. Where a covariate is missing for an areal unit, we treat the corresponding response as missing before fitting the INLA model. For details on how to fit spatial and spatio-temporal models with R and INLA, refer to Blangiardo et al. (2013).
3.3. Population attributable fraction
The population attributable fraction (or PAF, see Mansournia and Altman, 2018) provides a simple way of quantifying the combined impact of the two pollutants on COVID-19 incidence. For a given country, the PAF is defined as
$$\mathrm{PAF} = \frac{\sum_k E_k \exp(X_k \beta + U_k) - \sum_k E_k \exp(\tilde{X}_k \beta + U_k)}{\sum_k E_k \exp(X_k \beta + U_k)} \tag{5}$$

where $\tilde{X}_k$ is the vector of covariates for region $k$ in which the values for pollutants have been replaced by reference values common across regions. Here the reference values are the 25th percentiles of the PM2.5 (4.53 μg/m³) and NO2 (2.74 ppb) observations across countries. Fitting model (2) yields posterior samples for $\beta$ and $U_k$, which are then used in (5) to obtain a posterior distribution of the PAF that retains and propagates the uncertainty from the health model (2).
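A sketch of how the PAF can be evaluated from posterior samples follows. It assumes the PAF is the proportional reduction in expected cases, $1 - \sum_k E_k \exp(\tilde{X}_k \beta + U_k) / \sum_k E_k \exp(X_k \beta + U_k)$, computed once per posterior draw. All inputs below (expected counts, covariates, coefficient draws) are invented, and the helper names `with_reference` and `paf_draw` are hypothetical; only the reference values 4.53 and 2.74 come from the text.

```python
import math


def with_reference(covariates, reference):
    """Copy the covariate rows, replacing pollutant columns by reference values."""
    return [
        [reference.get(col, value) for col, value in enumerate(row)]
        for row in covariates
    ]


def paf_draw(expected, covariates, ref_covariates, beta, u):
    """PAF for a single posterior draw of (beta, u)."""
    def total_cases(rows):
        return sum(
            e * math.exp(sum(x * b for x, b in zip(row, beta)) + u_k)
            for e, row, u_k in zip(expected, rows, u)
        )
    return 1.0 - total_cases(ref_covariates) / total_cases(covariates)


# Hypothetical inputs: three regions, columns = (PM2.5, NO2, unemployment).
expected = [120.0, 80.0, 200.0]
covariates = [[6.0, 4.0, 5.0], [9.5, 7.0, 3.0], [5.0, 3.0, 8.0]]
ref_covariates = with_reference(covariates, {0: 4.53, 1: 2.74})

# Two made-up posterior draws of the coefficients and spatial effects.
beta_draws = [[0.10, 0.02, -0.04], [0.12, 0.03, -0.05]]
u_draws = [[0.1, -0.2, 0.0], [0.0, -0.1, 0.1]]

paf_samples = [
    paf_draw(expected, covariates, ref_covariates, beta, u)
    for beta, u in zip(beta_draws, u_draws)
]
print(paf_samples)  # one PAF value per draw; summarise with mean and quantiles
```

Computing the PAF draw by draw, rather than plugging in posterior means, is what carries the uncertainty in the coefficients and spatial effects through to the interval reported for the PAF.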
4. Results
4.1. Spatial distribution
In order to show comparable results across countries and small areas, we calculate the predicted incidence rate per 100,000 people, standardized to the EU standard population from Eurostat (2020). For area $k$, that is calculated by

$$\lambda_k = 10^5\, \theta_k\, \frac{\sum_{\ell} r_{\ell} S_{\ell}}{\sum_{\ell} S_{\ell}}$$

where $r_\ell$ is the age–sex incidence rate from (3) and $S_\ell$ is the EU standard population for age–sex group $\ell$. Posterior means of $\lambda_k$ for each country are shown on the left panels of Fig. 1. The right panels of Fig. 1 show the probabilities of 50% excess risk, $\Pr(\theta_k > 1.5 \mid Y)$, which convey some of the uncertainty in the estimates of $\theta_k$.
Fig. 1.
Incidence rate per 100,000 people, standardized to the EU standard population, and probabilities of 50% excess risk, $\Pr(\theta_k > 1.5 \mid Y)$, for the four countries in the study.
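The exceedance probabilities mapped in Fig. 1 can be approximated from posterior draws of the relative risk. The sketch below assumes that "50% excess risk" means a relative risk above 1.5 and uses a simple Monte Carlo proportion; the draws are invented and the helper name `exceedance_probability` is hypothetical.

```python
def exceedance_probability(theta_draws, threshold=1.5):
    """Share of posterior draws of the relative risk above the threshold."""
    return sum(1 for theta in theta_draws if theta > threshold) / len(theta_draws)


# Invented posterior draws of the relative risk for one area.
theta_draws = [1.2, 1.6, 1.7, 1.4, 2.0]
prob = exceedance_probability(theta_draws)
print(prob)
```

An area whose probability is close to one has an elevated risk that is unlikely to be an artefact of estimation noise, which is why the probability map complements the map of posterior means.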
The predicted standardized incidence rates per 100,000 people in the US are much higher than in the other countries, and in some counties they reach almost 25,000, or a quarter of the population. In contrast, its neighbour Canada has been more successful in controlling COVID-19, with the highest standardized incidence rate being only about 1500. Northern Italy has fairly high incidence, with the standardized incidence rate in northern provinces being about 3000 with high probability. Similarly, the north west of England has high exceedance probability, with the standardized incidence rate in the upper tier local authorities being about 5000.
4.2. Air pollution effects
The main results of this study are presented in Table 2, which shows the relative risk based on a single unit increase in air pollution along with the spatial parameters and population attributable fraction. Note that the effects of PM2.5 are inconsistent across countries. We found some evidence of a protective effect of long-term exposure to PM2.5 in Canada, but with large uncertainty, while a harmful effect is present in the US, with more modest effects in Italy and England. The lower bound of the estimated US effect is well above the upper bounds for Canada, Italy and England. The NO2 effects are more consistent, albeit very modest.
Table 2.
Estimates and 95% intervals for effect sizes, spatial parameters and population-attributable fraction of incidence. Effects for PM2.5, NO2, lung cancer incidence, percent unemployed, and percent ethnic minorities are the percent increase in relative risk for a one unit increase; a value of 1.2 corresponds to a regression coefficient of $\log(1.012)$.
Parameter | Canada Est | Canada CI | Italy Est | Italy CI | England Est | England CI | United States Est | United States CI |
---|---|---|---|---|---|---|---|---|
Relative risk | ||||||||
NO2 | 5.0 | (−3.8, 14.7) | 2.7 | (−0.4, 6.0) | 0.5 | (−0.5, 1.6) | 2.3 | (−3.2, 8.1) |
PM2.5 | −10.9 | (−23.9, 4.5) | 0.5 | (−2.5, 3.5) | 2.9 | (−0.7, 6.6) | 12.6 | (7.4, 18.1) |
Lung cancer | −0.2 | (−2.3, 2.0) | 0.0 | (−0.9, 0.9) | 0.7 | (0.5, 1.0) | 0.0 | (−0.2, 0.2) |
Unemployment | −19.0 | (−25.1,−12.4) | −7.4 | (−20.8, 8.6) | 4.5 | (−3.0, 12.7) | −4.3 | (−6.0, −2.5) |
Visible minorities | 2.8 | (0.5, 5.1) | 2.3 | (−0.2, 5.0) | 1.1 | (0.7, 1.5) | 1.3 | (1.1, 1.5) |
Spatial parameters | ||||||||
Std deviation | 1.1 | (0.7, 1.6) | 0.4 | (0.3, 0.5) | 0.2 | (0.1, 0.2) | 0.8 | (0.8, 0.9) |
Dependence | 0.6 | (0.4, 0.8) | 0.8 | (0.6, 0.9) | 0.9 | (0.8, 1.0) | 1.0 | (1.0, 1.0) |
Population attributable fraction | ||||||||
PAF | 3.4 | (−59.9, 39.6) | 45.9 | (11.1, 66.3) | 10.3 | (1.6, 19.0) | 26.1 | (21.5, 30.2) |
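Because the health model is log-linear, the percentages in Table 2 can be converted to regression coefficients and rescaled to other exposure contrasts. The sketch below illustrates the arithmetic; the function names are hypothetical, and the 2-unit contrast is an arbitrary example rather than a quantity reported in the paper.

```python
import math


def coef_from_percent(pct):
    """Regression coefficient implied by a percent increase per unit."""
    return math.log1p(pct / 100.0)


def percent_from_coef(beta, delta=1.0):
    """Percent increase in relative risk for an exposure increase of delta."""
    return 100.0 * math.expm1(beta * delta)


beta_pm25_us = coef_from_percent(12.6)  # US PM2.5 estimate from Table 2
two_unit_effect = percent_from_coef(beta_pm25_us, delta=2.0)
print(round(beta_pm25_us, 4), round(two_unit_effect, 1))  # 0.1187 26.8
```

Effects compound multiplicatively on the relative risk scale, so a 2-unit increase corresponds to $1.126^2 - 1$, not twice 12.6%.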
A one percentage point increase in the proportion of visible minorities in a region is associated with a 1.1%–2.8% increase in the COVID-19 incidence rate. This adverse association is consistent with the results from the National Urban League (2020), which found that the infection rates for black and Latino Americans are 62 and 73 per 10,000, respectively, compared with 23 per 10,000 for whites. As stated in National Urban League (2020), black and Latino Americans are more likely to live in crowded housing conditions and to work in essential jobs which cannot be performed from home. In addition, Table 2 suggests that the lung cancer mortality rate was positively associated with COVID-19 incidence in England, while unemployment is an important confounder in Canada and the US.
The population attributable fractions for each country are shown in the last row of Table 2. The results suggest that if long-term average PM2.5 dropped to 4.53 μg/m³ and NO2 dropped to 2.74 ppb, COVID-19 cases would likely fall by 26.1% (95% CI 21.5%, 30.2%) in the US, 45.9% (95% CI 11.1%, 66.3%) in Italy and 10.3% (95% CI 1.6%, 19.0%) in England. The PAF for Canada is negligible and has a wide credible interval, most likely because of the negative PM2.5 estimate. Table 2 also shows that the PAF for the US is more certain than those for the other countries, as its 95% CI is much narrower, while the PAF for Italy is less certain, with a very wide 95% CI. The large point estimate for Italy arises mainly because its long-term PM2.5 and NO2 levels (see Fig. A.5) are generally much higher than the PAF baseline levels.
Fig. 2 shows both the prior and the posterior distributions of the spatial dependence parameter $\rho$ and standard deviation parameter $\sigma$ from fitting the models in (2) to data from each country. Both parameters are well identified in all countries, although posterior distributions are narrowest for the US and its 3108 counties and widest for Canada’s 93 health regions. The posteriors for $\sigma$ have little mass near zero, justifying the spatial random effect in the health model (2).
Fig. 2.
Prior (red) and Posterior (black) for spatial dependence parameter $\rho$ and spatial standard deviation parameter $\sigma$.
Table 2 also shows that the spatial dependence increases with the number of study areal units. For example, the estimated spatial dependence parameter $\rho$ is 0.6 for Canada while it is 1.0 for the US. A higher value of $\rho$ indicates stronger residual spatial autocorrelation after accounting for the known covariates.
For validation purposes, we ran an alternative, commonly used convolution model proposed by Besag et al. (1991) with the parameterization from Riebler et al. (2016). We also performed a sensitivity analysis on the choice of prior distribution for the hyperparameters of the Leroux CAR model. We considered two alternative exponential priors for $\sigma$. For $\rho$, we specified two alternative priors representing left-shifted and right-shifted versions of the prior, rather than the roughly uniform prior used in the main analysis. The comparison results are presented in Table A.2 and show that the inferences are consistent across these health models, which supports the validity of the results.
Table A.2.
Prior sensitivity analysis, with estimates and 95% intervals for effect sizes with the prior from the manuscript, four additional priors, and an analysis with the convolution model parameterization of the spatial random effect from Riebler et al. (2016).
Parameter | Canada Est | Canada CI | Italy Est | Italy CI | England Est | England CI | United States Est | United States CI |
---|---|---|---|---|---|---|---|---|
Prior from manuscript, | ||||||||
NO2 | 5.0 | (−3.8, 14.7) | 2.7 | (−0.4, 6.0) | 0.5 | (−0.5, 1.6) | 2.3 | (−3.2, 8.1) |
PM2.5 | −10.9 | (−23.9, 4.5) | 0.5 | (−2.5, 3.5) | 2.9 | (−0.7, 6.6) | 12.6 | (7.4, 18.1) |
Lung cancer | −0.2 | (−2.3, 2.0) | 0.0 | (−0.9, 0.9) | 0.7 | (0.5, 1.0) | 0.0 | (−0.2, 0.2) |
Unemployment | −19.0 | (−25.1,−12.4) | −7.4 | (−20.8, 8.6) | 4.5 | (−3.0, 12.7) | −4.3 | (−6.0, −2.5) |
Visible minorities | 2.8 | (0.5, 5.1) | 2.3 | (−0.2, 5.0) | 1.1 | (0.7, 1.5) | 1.3 | (1.1, 1.5) |
, | ||||||||
NO2 | 4.9 | (−3.6, 14.2) | 2.5 | (−0.5, 5.7) | 0.5 | (−0.4, 1.6) | 2.5 | (−3.1, 8.4) |
PM2.5 | −11.8 | (−24.4, 2.9) | 0.4 | (−2.5, 3.5) | 3.0 | (−0.5, 6.7) | 12.5 | (7.3, 18.0) |
Lung cancer | 0 | (−2.1, 2.1) | −0.1 | (−1.0, 0.7) | 0.7 | (0.5, 1.0) | 0 | (−0.2, 0.2) |
Unemployment | −18.7 | (−24.9, −12.1) | −4.2 | (−18.0, 11.6) | 4.2 | (−3.2, 12.2) | −4.2 | (−6.0, −2.5) |
Visible minorities | 2.9 | (0.7, 5.1) | 2.2 | (−0.3, 4.8) | 1.1 | (0.7, 1.6) | 1.3 | (1.1, 1.5) |
, | ||||||||
NO2 | 5.1 | (−3.9, 15.2) | 3.0 | (−0.2, 6.3) | 0.5 | (−0.5, 1.6) | 2.1 | (−3.4, 7.9) |
PM2.5 | −9.7 | (−23.1, 6.2) | 0.5 | (−2.6, 3.6) | 2.8 | (−0.9, 6.5) | 12.8 | (7.5, 18.2) |
Lung cancer | −0.3 | (−2.5, 1.9) | 0.1 | (−0.8, 1.0) | 0.7 | (0.5, 1.0) | 0.0 | (−0.2, 0.2) |
Unemployment | −19.2 | (−25.1, −12.8) | −11.0 | (−23.5, 4.4) | 4.8 | (−2.8, 13.3) | −4.3 | (−6.0, −2.5) |
Visible minorities | 2.7 | (0.4, 5.0) | 2.5 | (−0.1, 5.2) | 1.1 | (0.6, 1.5) | 1.3 | (1.1, 1.5) |
, | ||||||||
NO2 | 4.9 | (−3.7, 14.4) | 2.5 | (−0.5, 5.7) | 0.5 | (−0.4, 1.6) | 2.5 | (−3.1, 8.3) |
PM2.5 | −11.8 | (−24.6, 3.2) | 0.4 | (−2.5, 3.5) | 3.0 | (−0.5, 6.7) | 12.5 | (7.3, 18.0) |
Lung cancer | 0.0 | (−2.1, 2.1) | −0.1 | (−1.0, 0.8) | 0.7 | (0.5, 1.0) | 0.0 | (−0.2, 0.2) |
Unemployment | −18.7 | (−25.0, −12.0) | −4.2 | (−18.0, 11.6) | 4.2 | (−3.2, 12.3) | −4.2 | (−6.0, −2.5) |
Visible minorities | 2.9 | (0.6, 5.1) | 2.2 | (−0.3, 4.8) | 1.1 | (0.7, 1.6) | 1.3 | (1.1, 1.5) |
, | ||||||||
NO2 | 5.2 | (−3.9, 15.2) | 3.0 | (−0.2, 6.3) | 0.5 | (−0.5, 1.6) | 2.2 | (−3.3, 8.0) |
PM2.5 | −9.9 | (−23.3, 6.0) | 0.5 | (−2.6, 3.6) | 2.8 | (−0.9, 6.5) | 12.7 | (7.5, 18.2) |
Lung cancer | −0.3 | (−2.4, 1.9) | 0.1 | (−0.8, 1.0) | 0.7 | (0.5, 1.0) | 0.0 | (−0.2, 0.2) |
Unemployment | −19.2 | (−25.2, −12.7) | −10.9 | (−23.7, 4.9) | 4.8 | (−2.8, 13.3) | −4.3 | (−6.0, −2.5) |
Visible minorities | 2.7 | (0.4, 5.0) | 2.5 | (−0.1, 5.2) | 1.1 | (0.6, 1.5) | 1.3 | (1.1, 1.5) |
Convolution model parameterization | ||||||||
NO2 | 3.9 | (−6.0, 15.2) | 3.9 | (0.7, 7.3) | 0.6 | (−0.4, 1.7) | 3.2 | (−2.3, 8.9) |
PM2.5 | −8.6 | (−23.7, 9.2) | 1.9 | (−1.2, 4.9) | 3.3 | (−0.3, 7.0) | 12.0 | (6.9, 17.4) |
Lung cancer | 0.1 | (−2.2, 2.5) | 0.0 | (−0.9, 0.9) | 0.7 | (0.4, 0.9) | 0.0 | (−0.2, 0.2) |
Unemployment | −20.6 | (−26.9, −14.0) | −17.0 | (−24.4, −8.6) | 3.1 | (−4.3, 10.9) | −4.1 | (−5.8, −2.4) |
Visible minorities | 2.3 | (−0.2, 4.9) | 3.2 | (0.5, 6.0) | 1.2 | (0.8, 1.7) | 1.3 | (1.1, 1.5) |
5. Conclusion and discussion
Ogen (2020) reasons that a body experiencing chronic respiratory stress due to air pollution has a diminished ability to defend itself from infections. It is therefore a plausible hypothesis that long-term exposure to air pollution makes people more vulnerable to contracting COVID-19. We attempt to inform the debate about the long-term effects of air pollution on COVID-19 by conducting a multi-country analysis, including Canada, Italy, England and the US.
The analysis presented in this paper has found that the inferred relationship between long-term ambient exposure to PM2.5 and NO2 and COVID-19 incidence is inconsistent across the four countries examined. After adjusting for confounding and spatial autocorrelation, a 1 μg/m³ increase in long-term exposure to PM2.5 is associated with a 12.6% (95% CI 7.4%, 18.1%) increase in the COVID-19 incidence rate in the US, while the effects in Italy and England are more modest at 0.5% (95% CI −2.5%, 3.5%) and 2.9% (95% CI −0.7%, 6.6%), respectively. In Canada the data provide some evidence of a protective effect of PM2.5, albeit characterized by large uncertainty. This protective effect with a wide credible interval is likely caused by spatial confounding between the spatial random effect and the explanatory variables included in the model, which occurs when the covariates have a spatial pattern and are collinear with the spatial random effects (see, e.g., Adin et al., 2021, Hodges and Reich, 2010).
Our finding of the important impact of PM2.5 on COVID-19 in the US is consistent with a recent study by Wu et al. (2020), who found that an increase of 1 μg/m³ in the long-term average PM2.5 was associated with an 11% (95% confidence interval 6%, 17%) increase in the COVID-19 mortality rate in the US. The finding of modest (not significant) effects of NO2 and PM2.5 in England agrees with Konstantinoudis et al. (2021), who found only some evidence of an effect for NO2 and large uncertainty for PM2.5. Similarly, an analysis performed by the UK’s Office for National Statistics (2020) found that long-term exposure to PM2.5 could increase the risk of contracting and dying from COVID-19 by up to 7%; however, when controlling for ethnicity, air pollution exposure had no statistically significant impact on COVID-19 deaths.
There are various reasons why COVID-19 prevalence varies across countries, and why socio-economic variables such as unemployment and ethnicity will influence COVID-19 prevalence more strongly in some countries than others. The biological hypothesis that air quality affects COVID-19 incidence through respiratory stress should, however, correspond to relative risks per unit increase being similar across countries. This lack of reproducibility should at a minimum lead the scientific community to view with some degree of caution the substantial relationships between air quality and COVID-19 reported by some studies. The population attributable fraction calculations in Table 2 estimate that in the US a quarter of cases can be ascribed to poor air quality, and the statistical error associated with this estimate is slight (95% CI 21%, 30%). If we were to take this result at face value, we might conclude that lockdown measures can be relaxed in areas where air quality is high. The authors of this paper would not recommend such an action, as consistent and reproducible evidence supporting it is lacking.
One explanation for the inconsistent effects is the inherent limits of ecological studies, where covariates refer to area-level, rather than individual-level, characteristics. Area-level measures are not always reliable proxies for individual-level exposures: individuals in a region are heterogeneous, and the ecological fallacy can occur when there are complex dependencies and interactions at work at the individual level. The relationship between individual-level and area-level exposures, and the measured values of the latter, could well be different in the four countries. The 3108 counties in the US provide higher-resolution spatial information than the 93–149 regions in the other countries, so it is possible that the higher PM2.5 effects in the US result from a more accurate exposure assessment and less attenuation of effects due to averaging.
Another explanation for the inconsistent results could be unmeasured confounders which are correlated with air quality to different degrees across countries. For example, PM2.5 might be more strongly correlated with mobility and long-distance commerce in the more car-dependent United States than the other three countries. If mobility and commerce are in turn strong predictors of COVID-19 prevalence, failure to adjust for them as model covariates could cause a stronger inferred COVID-19/PM2.5 relationship in the US than elsewhere. Access to testing, adherence to control measures, and individuals in high-risk occupations might also be more strongly correlated with PM2.5 in the US. Inferring causality from observational data is challenging and not always possible, and spatial observational data is particularly problematic (see Reich et al., 2020 for example). The lack of consistency in estimated PM2.5 effect sizes across countries suggests that the inferred relationships are not causal and additional unmeasured (and possibly individual-level) confounders are influencing the results.
This analysis has demonstrated the importance of replicating spatial analyses across multiple countries whenever possible. Spatial data on health outcomes and exposures is becoming increasingly available and the tools available for manipulating and managing these data make data acquisition a much simpler task than was the case previously. In the model building phase of an analysis, it is inevitable that results from several subtly different model formulations with different confounders will be known before a ‘final’ methodology is decided upon. Analysing several datasets with methodologies which are as close to identical as possible will help guard against overfitting and possible inherent bias towards positive results.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Footnotes
Acknowledgements: P. E. Brown is funded by the Natural Sciences and Engineering Research Council of Canada.
Appendix.
The data sources used in this study are presented in Table A.1, while the results of the prior sensitivity analysis of the disease model are shown in Table A.2. Maps and distributions of the data are shown in Fig. A.1, Fig. A.2, Fig. A.3, Fig. A.4, Fig. A.5.
References
- Adin A., Goicoa T., Hodges J.S., Schnell P.M., Ugarte M.D. Alleviating confounding in spatio-temporal areal models with an application on crimes against women in India. Stat. Model. 2021 1471082X2110154. [Google Scholar]
- AEA . 2011. UK modelling under the Air Quality Directive (2008/50/EC) for 2010 covering the following air quality pollutants: SO, NO, NO, PM, PM, lead, benezene, CO and ozone. http://uk-air.defra.gov.uk/assets/documents/reports/cat09/1204301513_AQD2010mapsrep_master_v0.pdf. [Google Scholar]
- Andree B.P.J. MedRxiv. Cold Spring Harbor Laboratory Press; 2020. Incidence of COVID-19 and connections with air pollution exposure: Evidence from the Netherlands. URL: https://www.medrxiv.org/content/early/2020/05/03/2020.04.27.20081562.
- Berry I., Soucy J.-P.R., Tuite A., Fisman D. Open access epidemiologic data and an interactive dashboard to monitor the COVID-19 outbreak in Canada. Canad. Med. Assoc. J. 2020;192(15):E420. doi: 10.1503/cmaj.75262.
- Besag J., York J., Mollie A. Bayesian image restoration with two applications in spatial statistics. Ann. Inst. Statist. Math. 1991;43:1–59.
- Blangiardo M., Cameletti M., Baio G., Rue H. Spatial and spatio-temporal models with R-INLA. Spat. Spatio-Temporal Epidemiol. 2013;4:33–49. doi: 10.1016/j.sste.2012.12.001.
- Brown P.E. Model-based geostatistics the easy way. J. Stat. Softw. 2015;63(12):1–24.
- Cole M.A., Ozgen C., Strobl E. Air Pollution Exposure and Covid-19 in Dutch Municipalities. Environ. Resour. Econ. 2020;76(4):581–610. doi: 10.1007/s10640-020-00491-4.
- Copat C., Cristaldi A., Fiore M., Grasso A., Zuccarello P., Signorelli S., Conti G., Ferrante M. The role of air pollution (PM and NO2) in COVID-19 spread and lethality: A systematic review. Environ. Res. 2020;191 doi: 10.1016/j.envres.2020.110129.
- Eurostat. 2020. EU standard population (based on the EU and EFTA 2011-30 population projections) URL: https://ec.europa.eu/eurostat/cache/metadata/Annexes/hlth_cdeath_esms_an1.pdf.
- Gautam S. COVID-19: air pollution remains low as people stay at home. Air Qual. Atmosp. Health. 2020;13 doi: 10.1007/s11869-020-00842-6.
- Gautam S., Samuel C., Gautam A.S., Kumar S. Strong link between coronavirus count and bad air: a case study of India. Environ. Dev. Sustainab. 2021 doi: 10.1007/s10668-021-01366-4.
- Atmospheric Composition Analysis Group. 2021. Surface PM2.5. https://sites.wustl.edu/acag/datasets/surface-pm2-5/
- Gupta A., Bherwani H., Gautam S., Anjum S., Musugu K., Kumar N., Anshul A., Kumar R. Air pollution aggravating COVID-19 lethality? Exploration in Asian cities using statistical models. Environ. Dev. Sustainab. 2021;23:6408–6417. doi: 10.1007/s10668-020-00878-9.
- Haining R., Li G., Maheswaran R., Blangiardo M., Law J., Best N., Richardson S. Inference from ecological models: estimating the relative risk of stroke from air pollution exposure using small area data. Spat. Spatio-Temporal Epidemiol. 2010;1(2-3):123–131. doi: 10.1016/j.sste.2010.03.006.
- Hodges J.S., Reich B.J. Adding spatially-correlated errors can mess up the fixed effect you love. Amer. Statist. 2010;64(4):325–334.
- Huang G., Lee D., Scott E.M. Multivariate space–time modelling of multiple air pollutants and their health effects accounting for exposure uncertainty. Stat. Med. 2018;37(7):1134–1148. doi: 10.1002/sim.7570.
- Janes H., Dominici F., Zeger S.L. Trends in air pollution and mortality: an approach to the assessment of unmeasured confounding. Epidemiology. 2007;18(4):416–423. doi: 10.1097/EDE.0b013e31806462e9.
- Killeen B.D., Wu J.Y., Shah K., Zapaishchykova A., Nikutta P., Tamhane A., Chakraborty S., Wei J., Gao T., Thies M., Unberath M. 2020. A county-level dataset for informing the United States’ response to COVID-19. arXiv:2004.00756 [cs.CY]
- Konstantinoudis G., Padellini T., Bennett J., Davies B., Ezzati M., Blangiardo M. Long-term exposure to air-pollution and COVID-19 mortality in England: A hierarchical spatial analysis. Environ. Int. 2021;146:106316. doi: 10.1016/j.envint.2020.106316.
- Leroux B., Lei X., Breslow N. Springer-Verlag; New York: 2000. Estimation of Disease Rates in Small Areas: A New Mixed Model for Spatial Dependence; pp. 179–191. Chap. Statistical Models in Epidemiology, the Environment and Clinical Trials, Halloran, M and Berry, D (eds)
- Lucas J. Scholars Portal Dataverse; 2020. COVID-19 Canada Health Regions Shapefile. (visited on 06/08/2021)
- Mansournia M.A., Altman D.G. Population attributable fraction. BMJ. 2018;360 doi: 10.1136/bmj.k757.
- National Urban League. A National Urban League Publication; 2020. 2020 State of Black America: Unmasked. URL: https://soba.iamempowered.com/?_ga=2.182044386.470730837.1597941243-1376958794.1597941243.
- Office for National Statistics. 2020. Does exposure to air pollution increase the risk of dying from the coronavirus (COVID-19)?
- Ogen Y. Assessing nitrogen dioxide (NO2) levels as a contributing factor to coronavirus (COVID-19) fatality. Sci. Total Environ. 2020;726 doi: 10.1016/j.scitotenv.2020.138605.
- Pope I.C., Burnett R., Thun M., et al. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA. 2002;287(9):1132–1141. doi: 10.1001/jama.287.9.1132.
- Pozzer A., Dominici F., Haines A., Witt C., Munzel T., Lelieveld J. Regional and global contributions of air pollution to risk of death from COVID-19. Cardiovasc. Res. 2020;116:2247–2253. doi: 10.1093/cvr/cvaa288.
- Reich B.J., Yang S., Guan Y., Giffin A.B., Miller M.J., Rappold A.G. 2020. A review of spatial causal inference methods for environmental and epidemiological applications. arXiv:2007.02714 [stat.ME]
- Riebler A., Sørbye S.H., Simpson D., Rue H. An intuitive Bayesian spatial model for disease mapping that accounts for scaling. Stat. Methods Med. Res. 2016;25(4):1145–1165. doi: 10.1177/0962280216660421. PMID: 27566770.
- Rue H., Martino S., Chopin N. Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. J. R. Stat. Soc. Ser. B Stat. Methodol. 2009;71(2):319–392.
- Ugarte M., Adin A., Goicoa T., Militino A. On fitting spatio-temporal disease mapping models using approximate Bayesian inference. Stat. Methods Med. Res. 2014;23(6):507–530. doi: 10.1177/0962280214527528.
- Wu X., Nethery R.C., Sabath M.B., Braun D., Dominici F. Air pollution and COVID-19 mortality in the United States: Strengths and limitations of an ecological regression analysis. Sci. Adv. 2020;6(45) doi: 10.1126/sciadv.abd4049.