PLOS One. 2021 May 27;16(5):e0237277. doi: 10.1371/journal.pone.0237277

Causal graph analysis of COVID-19 observational data in German districts reveals effects of determining factors on reported case numbers

Edgar Steiger 1,*, Tobias Mussgnug 1, Lars Eric Kroll 1
Editor: Sungwoo Lim 2
PMCID: PMC8158986  PMID: 34043653

Abstract

Several determinants are suspected to be causal drivers for new cases of COVID-19 infection. Correcting for possible confounders, we estimated the effects of the most prominent determining factors on reported case numbers. To this end, we used a directed acyclic graph (DAG) as a graphical representation of the hypothesized causal effects of the determinants on new reported cases of COVID-19. Based on this, we computed valid adjustment sets of the possible confounding factors. We collected data for Germany from publicly available sources (e.g. Robert Koch Institute, Germany’s National Meteorological Service, Google) for 401 German districts over the period of 15 February to 8 July 2020, and estimated total causal effects based on our DAG analysis by negative binomial regression. Our analysis revealed favorable effects of increasing temperature, increased public mobility for essential shopping (grocery and pharmacy) or within residential areas, and awareness measured by COVID-19 burden, all of them reducing the outcome of newly reported COVID-19 cases. Conversely, we saw adverse effects leading to an increase in new COVID-19 cases for public mobility in retail and recreational areas or workplaces, awareness measured by searches for “corona” in Google, higher rainfall, and some socio-demographic factors. Non-pharmaceutical interventions were found to be effective in reducing case numbers. This comprehensive causal graph analysis of a variety of determinants affecting COVID-19 progression gives strong evidence for the driving forces of mobility, public awareness, and temperature, whose implications need to be taken into account for future decisions regarding pandemic management.

Introduction

As the COVID-19 pandemic progresses, research on mechanisms behind the transmission of SARS-CoV-2 shows conflicting evidence [1–3]. While effects of mobility have been extensively discussed, less is known about other factors such as changing awareness in the population [4–6] or the effects of temperature [7–9]. A limiting factor in many studies is the lack of a causal approach to assess the causal contributions of various factors [10]. This can lead to distorted estimates of the causal factors with observational data [10–12].

With COVID-19, we find ourselves in a situation in which information on the causal contribution of various influencing factors in the population is urgently needed to inform politicians and health authorities. On the other hand, trials cannot be carried out for obvious ethical and legal reasons. Therefore, when assessing the effects of determinants of SARS-CoV-2 spread, special attention must be paid to strategies for the selection of confounding factors.

Another problem with assessing the effects of various determinants of SARS-CoV-2 spread is the heterogeneity of the countries and regions examined, for example, in the Johns Hopkins University (JHU) COVID-19 database [13]. Comparisons of case-number time series from different countries and observational periods can be strongly distorted by factors such as testing capacities and regional variations.

Our objective is to provide valid estimates of the effects of the main drivers of the pandemic with a causal graph approach. We conducted a scoping review of the available studies regarding signaling pathways and determinants of the spread of SARS-CoV-2 infections and the reported new COVID-19 cases. Then we integrated the current findings into a directed acyclic graph for the progress of the pandemic at the regional level. Using the resulting model and the do-calculus we found identifiable effects without blocked causal paths whose effects can be analyzed with observational data [14]. We used regional time series data of all German districts (401) from various publicly available sources to analyze these questions on a regional level. Germany is a good choice in this regard, because it has ample data on contributing factors on the regional level and has had high testing and treatment capacities from early on in the pandemic.

Causal model

We used a directed acyclic graph (DAG) [11, 12] as a tool to analyze the causal relationships between several exposures and SARS-CoV-2 spread. To get an overview of published associations, we conducted a scoping review from 20 to 22 May 2020 in PubMed and Google Scholar. We restricted the search to publications in English or German from the preceding year. In PubMed, the following search terms were applied to titles and abstracts: (“COVID-19” OR “COVID19” OR “Corona” OR “Coronavirus” OR “SARS-CoV-2”), combined separately with each of the exposure terms (“mobility”, “public awareness”, “awareness”, “google trends”, “ambient temperature”, “temperature”). For “mobility”, N = 103 records were scanned in PubMed, together with the first ten pages (100 results) in Google Scholar, and n = 8 studies were analyzed (“awareness”/“public awareness”/“google trends”: n = 9, N = 215; “temperature”/“ambient temperature”: n = 16, N = 235). We integrated these findings where possible into the construction of our DAG, which can be seen in Fig 1.

Fig 1. DAG of determinants of reported COVID-19 cases on the district level.

Unobserved variables are light gray, variables marked with an asterisk (*) are confounded by weekday/holiday.

A number of studies report a strong association between mobility restrictions and the number of new COVID-19 cases: restrictive measures (e.g. “stay-at-home” orders, travel bans, or school closures) are shown to possibly reduce the COVID-19 incidence [2, 15–21]. However, some studies point out that the combination of various non-pharmaceutical interventions (NPIs) is decisive in preventing new infections [22, 23].

Google Trends [24] data can be used as a tool to gain insight into public interest in (awareness of) the coronavirus disease. Several recent studies imply a connection between relative search volume (RSV) indices and reported new COVID-19 cases [4–6, 25–30]. Searches for some terms, e.g. “COVID-19” or “coronavirus”, preceded newly infected cases/total number of cases by roughly 7 to 14 days in different countries [4–6, 26]. Additionally, we acknowledged that individual risk-aware behavior might be a reaction to the current COVID-19 burden (measured as reported cases on the day of exposure).

Mixed evidence is available regarding the effect of temperature. On the one hand, several papers report an association between an increase in temperature and a decrease in newly infected COVID-19 cases [7–9, 31–36]. On the other hand, the opposite has also been found [37, 38], and some studies found no association at all [22, 39–42]. It should be noted that few studies considered confounding variables other than meteorological ones (especially age and population density, among others [22, 36, 39]). In addition, the transferability of results between different climate zones is questionable. To avoid possible bias caused by weather variables other than temperature, we included rain, wind, and humidity in our model.

When investigating causal determinants of SARS-CoV-2 infections, a number of confounders have to be considered. Well-known risk factors for SARS-CoV-2 as well as for other infections are demographic factors such as age, gender, socio-economic status (SES), population density, and foreign citizenship/ethnicity [13, 43, 44]. In Germany, as in other countries (e.g. Brazil, USA, or the UK), populist parties or politicians and their electorate tend to be more sceptical about the effects of containment measures than the rest of the electorate [45, 46]. Therefore we considered both “right-wing populist party votes” and “voter turnout” as possible confounders. Public health interventions were also taken into account (contact restrictions, school closures etc.), as their implementation showed strong correlations with controlling the spread of SARS-CoV-2 [22, 23, 47]. To avoid bias due to the reporting delay of case numbers, we had to include weekday and German holidays. We also included some unobserved variables in our DAG (e.g. “Herd immunity”). Please note that “Exposure to SARS-CoV-2” is itself an unobserved variable: German case numbers are reported with a delay after the date of exposure and symptom onset. Exposure to the virus should not be confused with the formal exposure variables of the DAG.

Materials and methods

Data

We collected and aggregated data on reported COVID-19 cases, regional socio-demographic factors, weather, and general mobility on district and state level in Germany for the period of 15 February 2020 to 8 July 2020. Our observation period for the outcome consisted of all dates from 20 February 2020 to 8 July 2020 (T = 140), since we used a lag of 5 days for all confounders. We did not exclude any states or districts (K = 401). We analyzed the daily reported number of new cases as outcome (KT = 56 140 observations). The set of possible predictors was derived from our causal DAG (see Table 1 and Fig 1). Due to modelling and data limitations, some of the predictors were unobserved or were modelled as a construct consisting of several variables. For our causal graph analysis, we computed adjustment sets separately for all observed exposures within the DAG (if the respective exposure was identifiable within the DAG causal analysis framework).

Table 1. Observed model variables.

| Variable | Dynamics | Level | Type | Unit/comment | Source |
|---|---|---|---|---|---|
| Weekday | daily | national | categorical | Sat through Thu as six binary variables, Fri as baseline | - |
| Holiday (report) | daily | national | binary | - | - |
| Holiday (exposure) | daily | national | binary | - | - |
| Mobility | | | | | |
| Retail and recreation | daily | state | numeric | percent change compared to reference period | Google [49] |
| Grocery and pharmacy | daily | state | numeric | percent change compared to reference period | Google [49] |
| Parks | daily | state | numeric | percent change compared to reference period | Google [49] |
| Workplaces | daily | state | numeric | percent change compared to reference period | Google [49] |
| Residential | daily | state | numeric | percent change compared to reference period | Google [49] |
| Transit stations | daily | state | numeric | percent change compared to reference period | Google [49] |
| Awareness | | | | | |
| Searches corona | daily | state | numeric | percent relative to other states and observation period | Google [24] |
| COVID-19 burden | daily | district | numeric | reported cases on day of exposure | RKI [48] |
| Weather | | | | | |
| Rainfall | daily | district | numeric | mm (l/sqm) | DWD [50] |
| Temperature | daily | district | numeric | °C | DWD [50] |
| Humidity | daily | district | numeric | relative humidity (%) | DWD [50] |
| Wind | daily | district | numeric | m/s | DWD [50] |
| Interventions | | | | | |
| Ban of mass gatherings | daily | national | binary | - | - |
| School and kindergarten closures | daily | state | numeric | 0 for no closure, 1 for full closure, 0.5 for partial reopening | - |
| Contact restrictions | daily | national | binary | - | - |
| Mandatory face masks | daily | district | binary | - | IZA [52] |
| Socio-demographic | | | | | |
| Age | constant | district | numeric | 2 variables: share of population ≥ 65 years and < 18 years | INKAR [51] |
| Gender | constant | district | numeric | share of female population | INKAR [51] |
| Population density | constant | district | numeric | population per sqkm | INKAR [51] |
| Foreign citizens | constant | district | numeric | 2 variables: share of foreign citizens and of population seeking refuge | INKAR [51] |
| Socio-economic status | constant | district | numeric | share of households with low income | INKAR [51] |
| Turnout | constant | district | numeric | voter turnout in last election | INKAR [51] |
| Right-wing populist party votes | constant | district | numeric | share of votes for AfD in last election | INKAR [51] |
| Nursing homes | constant | district | numeric | number of nursing (retirement) homes | INKAR [51] |
| Case numbers | | | | | |
| Reported new cases of COVID-19 | daily | district | numeric | - | RKI [48] |
| Active cases | daily | district | numeric | active cases on day of report | RKI [48] |

Variables

We downloaded German daily case numbers on district level reported by the Robert Koch Institute (RKI, [48]) and aggregated them by date. The number of daily active cases for day d was derived as the difference between the total (cumulative) number of reported cases on day d and on day d − 14 (14 days as a conservative estimate for the infectious period, which corresponds here to the required quarantine time in Germany).
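The active-case derivation can be illustrated with a minimal sketch. The function name `active_cases` and the example series are hypothetical and not taken from the study's code; the sketch only assumes that the input is a list of daily new cases for one district and that days before the start of the series count as zero.

```python
from itertools import accumulate

def active_cases(daily_new, window=14):
    """Active cases on day d: cumulative cases on day d minus cumulative cases on day d - window."""
    cumulative = list(accumulate(daily_new))
    active = []
    for d, total in enumerate(cumulative):
        # Before day `window` there is no earlier total to subtract (assumed to be 0).
        earlier = cumulative[d - window] if d >= window else 0
        active.append(total - earlier)
    return active

# Hypothetical district with 3 new cases per day for 20 days.
example = active_cases([3] * 20)
print(example[0], example[13], example[19])  # prints: 3 42 42
```

With a constant inflow of 3 cases per day, the active count plateaus at 14 × 3 = 42 once the 14-day window is full.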

To assess the mobility of the German population, we used data publicly available on German state level from Google [49]. Measurements are daily relative changes of mobility in percent compared to the period of 3 January 2020 to 6 February 2020. Missing values (25 out of 13 488) were imputed with value 0 and the state level measurements were passed onto districts within the corresponding state. Google mobility data was available for six different sectors of daily life (“retail and recreation”, “grocery and pharmacy”, “parks”, “transit stations”, “workplaces”, “residential”) which means that “mobility” is a construct consisting of several variables. All variables but “residential” mobility are relative changes of daily visitor numbers to the corresponding sectors compared to the reference period. “Residential” mobility is the relative change of daily time spent at residential areas. The six mobility variables showed high correlations among each other and with other variables. To reduce multicollinearity, we transformed them by principal component analysis (PCA) into six uncorrelated principal components which were used in place of the original variables.
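As an illustration of how PCA yields uncorrelated components, and how coefficients estimated on the components can be mapped back to the original variables, here is a small sketch using NumPy. The data are simulated and hypothetical, the variable names are ours, and the sketch is not the authors' R code. It assumes a plain PCA via singular value decomposition of the centered data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 days x 6 mobility variables driven by one shared latent factor,
# which makes the six columns highly correlated.
latent = rng.normal(size=(200, 1))
mobility = latent @ rng.normal(size=(1, 6)) + 0.3 * rng.normal(size=(200, 6))

# PCA via SVD of the centered data; the rows of `vt` are the principal axes.
centered = mobility - mobility.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = centered @ vt.T  # six principal component scores per day

# The component scores are mutually uncorrelated (up to floating point error).
corr = np.corrcoef(components, rowvar=False)
max_abs_corr = float(np.abs(corr - np.diag(np.diag(corr))).max())

# Coefficients `gamma` on the components correspond to coefficients
# `beta = vt.T @ gamma` on the centered original variables.
gamma = rng.normal(size=6)  # hypothetical component coefficients
beta = vt.T @ gamma
```

Because `components @ gamma` equals `centered @ beta`, a linear predictor fitted on the components can be re-expressed in terms of the original mobility variables without refitting.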

The notion of awareness in the population of COVID-19 describes the general state of alertness about the new infectious disease. As such, it was hard to measure directly. As a proxy, we used the relative interest in the topic term “corona” as indicated by Google searches. The daily data was available on state level [24] and passed onto district level. As a second proxy for awareness, we used the daily reported number of COVID-19 cases on the day of the exposure: Since media reported case numbers prominently, we assumed that this could reflect individual awareness, too.

We constructed daily weather from four variables (“temperature”, “rainfall”, “humidity”, “wind”). Weather data was downloaded from Deutscher Wetterdienst (DWD, [50]) for all weather stations in Germany below 1000 meters altitude with daily records for our observation period. District level daily weather data was aggregated per district by averaging the data from the three nearest weather stations (which includes weather stations inside the district). Missing values were imputed with mean values (n = 59 for wind).
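The station averaging can be sketched as follows. The text does not say from which point of a district the distance to a station was measured, so the sketch assumes a single reference coordinate per district (for example a centroid) and great-circle distance. The function names, coordinates, and values are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    radius = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def district_value(district_latlon, stations, k=3):
    """Mean of the daily measurement over the k stations nearest to the district reference point.

    stations: list of (lat, lon, value) tuples for one day and one weather variable.
    """
    lat, lon = district_latlon
    ranked = sorted(stations, key=lambda s: haversine_km(lat, lon, s[0], s[1]))
    nearest = ranked[:k]
    return sum(s[2] for s in nearest) / len(nearest)

# Hypothetical stations: three close to the district, one far away.
stations = [(52.5, 13.4, 10.0), (52.6, 13.5, 12.0), (52.4, 13.3, 14.0), (48.1, 11.6, 30.0)]
print(district_value((52.52, 13.40), stations))  # prints: 12.0
```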

The reported number of COVID-19 cases varied strongly by day of the week. Thus, we included “weekday” as a categorical variable. Similarly, the reported cases and the exposure to the virus were affected by official holidays. Within the observation period, this included among others Good Friday, Easter Monday, and Labor Day. To correct for effects of these days, we included two variables in the model, “Holiday (report)” (indicates if the day of the report was a holiday, because governmental health departments were less likely to be on full duty) and “Holiday (exposure)” (indicates if the day of exposure to the virus was a holiday, because the population behaves differently on holidays).

For different official and political interventions on a daily basis and the district level we used one-hot encoded daily variables, i.e. ban of mass gatherings, school and kindergarten closures and their gradual reopening, contact restrictions, and mandatory face masks for shopping and public transport.

We included several social, economic, and demographic factors on the district level with direct or indirect influence on the risk of exposure to SARS-CoV-2 in our analysis. All are readily available from INKAR database [51]. We used the share of population that is 65 years or older and the share of population that is younger than 18 years (Age), the share of females in population (Gender), the population density, the share of foreign citizenships and the share of the population seeking refuge (Foreign citizenship), the share of low-income households (Socio-economic status), voter turnout, share of right-wing populist party votes, and the number of nursing (retirement) homes.

All continuous variables but the outcome “Reported new cases of COVID-19” and the offset “Active cases” were centered and scaled by one standard deviation for numerical stability, while we left binary variables as-is. After estimating the effects of variables, we re-scaled continuous variables’ effects to their original scale. Additionally for mobility variables, we re-transformed the effects of the principal components to the original mobility variables. Furthermore, we lagged the effect of all variables (but outcome, offset, and the non-dynamic socio-demographic variables) by 5 days (optimal lag found by cross-validation) which means that we assumed that their effects on the outcome will be visible after 5 days.
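The preprocessing described above (standardizing, lagging by 5 days, and re-scaling an effect to the original unit) can be sketched in a few lines. This is an illustration under our own assumptions, not the authors' code: the helper names are hypothetical, the sample standard deviation is used for scaling, and the re-scaling relies on the identity that a coefficient per standard deviation divided by the standard deviation gives the coefficient per original unit.

```python
import math
from statistics import mean, stdev

def standardize(values):
    """Center and scale to unit standard deviation; also return the SD for later re-scaling."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values], s

def lag(values, days=5):
    """Shift a daily series so that position t holds the value from day t - days."""
    if days == 0:
        return list(values)
    return [None] * days + list(values[:-days])

def rescale_rate(rate_per_sd, sd):
    """Convert exp(beta) per standard deviation into exp(beta) per original unit."""
    return math.exp(math.log(rate_per_sd) / sd)

temperature = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]  # hypothetical daily values in degrees Celsius
scaled, sd = standardize(temperature)
lagged = lag(scaled, days=5)  # the first five outcome days have no lagged exposure
```

Rows with a missing lagged exposure (`None`) would be dropped before fitting, which matches the outcome period starting 5 days after the first day of collected data.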

Methods

Causal analysis with DAG and adjustment sets

We used a directed acyclic graph as a graphical representation of the hypothesized causal reasoning that leads to exposure to the SARS-CoV-2 virus, onset of COVID-19, and finally reports of COVID-19 cases. We use the terms “causal effect” or “causal relationship” for effect estimates that are based on this causal graph framework. Every node $v_i$ in the graph is the graphical representation of an observed or unobserved variable $x_i$; a directed edge $e_{ij}$ is an arrow from node $v_i$ to $v_j$ that implies a direct causal relationship from variable $x_i$ onto variable $x_j$. The set of all nodes is denoted by $V$ and the set of all edges by $E$, so the complete DAG is the tuple $G = (V, E)$. The seminal works of Spirtes and Pearl [53, 54] introduce the theory of causal analysis, do-calculus, and how to analyze a DAG to estimate the total or direct causal effect from a variable $x_i$ onto a variable $x_j$. The direct effect is the effect associated with the edge $e_{ij}$ only (if it exists), while the total effect also takes indirect effects via other paths from $v_i$ to $v_j$ into account. Here we estimated total effects only, since most of our variables were not hypothesized to have a direct effect on the reported number of new COVID-19 cases. In contrast to prediction tasks, where one would include all variables available, it is ill-advised to use all available variables to estimate causal effects, because adjusting for unnecessary variables within the causal DAG can introduce bias. This is why we need to identify a valid set of necessary variables (an adjustment set) to estimate the proper causal effect [54]. The “minimal adjustment set” [55] is a valid adjustment set of variables that does not contain another valid adjustment set as a subset. However, identifying a minimal adjustment set might not be enough to reliably estimate the causal effect. Thus, we identified the “optimal adjustment set” [56] as the set of variables which is a valid adjustment set while having the lowest Akaike information criterion (AIC).
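To make the notion of a valid adjustment set concrete, the following sketch checks Pearl's classical back-door criterion on a toy DAG. Everything in it is an assumption for illustration: the five-edge graph is hypothetical and is not the DAG of Fig 1, the function names are ours, and the R packages named in this paper implement their own, more complete algorithms. The d-separation test uses the standard construction of moralizing the ancestral graph.

```python
from itertools import combinations

def parents_of(edges, node):
    return {p for p, c in edges if c == node}

def ancestors(edges, nodes):
    """The given nodes together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents_of(edges, stack.pop()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def descendants(edges, node):
    seen, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for p, c in edges:
            if p == cur and c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def d_separated(edges, x, y, z):
    """True if z d-separates x and y (moralized ancestral graph method)."""
    keep = ancestors(edges, {x, y} | set(z))
    sub = [(p, c) for p, c in edges if p in keep and c in keep]
    undirected = {frozenset(e) for e in sub}
    for node in keep:  # moralize: connect parents that share a child
        for a, b in combinations(sorted(parents_of(sub, node)), 2):
            undirected.add(frozenset((a, b)))
    seen, stack = {x}, [x]
    while stack:  # search from x without passing through nodes in z
        cur = stack.pop()
        for e in undirected:
            if cur in e:
                (other,) = e - {cur}
                if other not in z and other not in seen:
                    seen.add(other)
                    stack.append(other)
    return y not in seen

def satisfies_backdoor(edges, x, y, z):
    """Back-door criterion: z has no descendant of x and blocks all paths into x."""
    if set(z) & descendants(edges, x):
        return False
    without_outgoing = [(p, c) for p, c in edges if p != x]
    return d_separated(without_outgoing, x, y, z)

# Hypothetical toy DAG (parent, child); not the graph used in the study.
toy = [("Temperature", "Mobility"), ("Temperature", "Cases"),
       ("Weekday", "Mobility"), ("Weekday", "Cases"), ("Mobility", "Cases")]
print(satisfies_backdoor(toy, "Mobility", "Cases", {"Temperature", "Weekday"}))  # prints: True
print(satisfies_backdoor(toy, "Mobility", "Cases", {"Temperature"}))             # prints: False
```

In this toy graph both common causes must be adjusted for, so `{"Temperature", "Weekday"}` is also the minimal adjustment set for the effect of mobility on cases.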

We analyzed the DAG from Fig 1 with the R Software [57] and the R packages dagitty (formal representation of the graph and minimal adjustment sets [12]) and pcalg (for finding an optimal adjustment set [58]). For the defined exposures and the outcome “Reported new cases of COVID-19”, we computed the minimal and optimal adjustment sets. Since it was possible that these sets contained unobserved variables that needed to be left out of the regression model, we chose the valid set with the lowest AIC (see next section) to estimate the final total causal effect from exposure to outcome.

Regression with negative binomial model

We can estimate the causal effect from exposure to outcome by regression [54]. Since the outcome “Reported new cases of COVID-19” is a count variable, one should not employ a linear regression model with Gaussian errors, but instead we assumed a log-linear relationship between the expected value of the outcome Y (new cases) and regressors x, as well as a Poisson or negative binomial distribution for Y:

$$\log(E[Y \mid x]) = \alpha + \sum_{i \in S} \beta_i \cdot x_i, \tag{1}$$

where $\alpha$ is the regression intercept, $S$ is the set of adjustment variables for the exposure $i^*$ including the exposure variable itself, and $\beta_i$ are the regression coefficients corresponding to the variables $x_i$. As such, $\beta_{i^*}$ is the total causal effect from the exposure variable $x_{i^*}$ on the outcome $Y$.

The Poisson regression assumes equality of mean and variance. If the variance is higher than the mean, one observes so-called overdispersion. This indicates that one should instead use regression with a negative binomial distribution, which estimates the variance parameter separately from the mean.
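A rough plausibility check for overdispersion can be done with the marginal moments of the outcome reported in Table 2 (mean 3.53, standard deviation 10.29). The sketch below is only that: the Poisson assumption concerns the variance conditional on the regressors, so a marginal variance-to-mean ratio is suggestive rather than conclusive. The variable names are ours.

```python
# Marginal moments of "Reported new cases COVID-19" from Table 2.
mean_cases = 3.53
sd_cases = 10.29

variance = sd_cases ** 2
dispersion_ratio = variance / mean_cases  # equals 1 under a marginal Poisson distribution

print(round(variance, 1), round(dispersion_ratio, 1))  # prints: 105.9 30.0
```

A ratio of about 30 is far above 1, which is in line with preferring the negative binomial over the Poisson model.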

We needed to account for the fact that our outcome is not counted per time unit (one day) only, but depends on the number of active COVID-19 cases: Holding all other variables fixed, the number of new cases Y is a constant proportion of the number of active cases A. This was modeled by including an offset log(A + 1) in the regression model Eq (1):

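$$\log(E[Y \mid x]) = \alpha + \log(A+1) + \sum_{i \in S} \beta_i \cdot x_i$$

$$\log\left(\frac{E[Y \mid x]}{A+1}\right) = \alpha + \sum_{i \in S} \beta_i \cdot x_i \tag{2}$$

$$\frac{E[Y \mid x]}{A+1} = \exp(\alpha) \cdot \prod_{i \in S} \exp(\beta_i)^{x_i}. \tag{3}$$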

Here we added a pseudocount “+1” to ensure a finite logarithm and avoid division by 0.

One can interpret the model as approximating the log-ratio of new cases and active cases by a linear combination of the regressor variables in Eq (2). If all variables $x_i$ are centered in Eq (3), we have for the baseline $x_i = 0$ for all $i$: $E[Y \mid x = 0] = \exp(\alpha) \cdot (A+1)$. In other words, the exponentiated intercept is the baseline daily infection rate (how many people does one infected individual infect in one day). If we hold all variables $x_i$ fixed (e.g. at baseline 0) in Eq (3) but now increase the exposure variable $x_{i^*} = 0$ by one unit to $x_{i^*} + 1 = 0 + 1$, we have

$$E[Y \mid x] = \exp(\alpha) \cdot (A+1) \cdot \exp(\beta_{i^*})^{x_{i^*}+1} \cdot \prod_{i \neq i^*} \exp(\beta_i)^{0} = \exp(\alpha) \cdot (A+1) \cdot \exp(\beta_{i^*}),$$

which means the exponentiated coefficient $\beta_{i^*}$ describes the rate change of the outcome by one unit increase of the exposure.

In practice, given observations of $Y$ and $x$ we estimate the regression coefficients $\alpha$ and $\beta_i$ by maximum likelihood [59]. Our observational measurements are $y_{kt}$ and $x_{ikt}$, where $k$ indicates the corresponding district and $t$ the date of measurement.
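The quantity that maximum likelihood optimizes can be written down compactly. The sketch below assumes the common mean/dispersion parameterization of the negative binomial with variance μ + μ²/θ and includes the offset log(A + 1) from the regression model. Function names and the tiny data set are hypothetical, and no optimizer is included; the AIC helper uses the usual definition 2k − 2 log L.

```python
import math

def nb_log_pmf(y, mu, theta):
    """Log probability of count y under a negative binomial with mean mu and dispersion theta."""
    return (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
            + theta * math.log(theta / (theta + mu))
            + y * math.log(mu / (theta + mu)))

def log_likelihood(ys, xs, actives, alpha, betas, theta):
    """Sum of log probabilities with linear predictor alpha + log(A + 1) + sum_i beta_i * x_i."""
    total = 0.0
    for y, x, a in zip(ys, xs, actives):
        eta = alpha + math.log(a + 1) + sum(b * xi for b, xi in zip(betas, x))
        total += nb_log_pmf(y, math.exp(eta), theta)
    return total

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * loglik

# Hypothetical observations: new cases, one standardized regressor, active cases.
ys = [2, 0, 5]
xs = [[0.1], [-0.3], [0.5]]
actives = [10, 4, 30]
ll = log_likelihood(ys, xs, actives, alpha=-2.0, betas=[0.4], theta=1.5)
```

A fitting routine would search for the values of `alpha`, `betas`, and `theta` that maximize `ll`; with a ridge penalty, the penalized version of this objective is maximized instead.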

We conducted a log-linear regression (function glm with family = poisson() for Poisson regression, and glm.nb from the MASS package for the negative binomial regression [60]) for the full data set to assess general model adequacy and to estimate the θ parameter of the negative binomial. The proper lag between exposures and outcome was found by 10-fold cross-validation over lags between 1 and 20 days. Model diagnostics on the final full model did not show severe problems with model assumptions (linearity, distribution of residuals, independence of observations).

Analysis of variance inflation factors revealed some problems with multicollinearity. To reduce its effects, we first transformed the highly correlated mobility variables by PCA as described above. Second, we used a ridge regression approach [61], a regularization method that shrinks regression coefficients and alleviates the effect of correlation between variables on their respective regression coefficients. Regularized regression also allows for better fits on unseen data and thus helps to prevent overfitting. The hyper-parameter λ of the ridge regression was chosen by 10-fold cross-validation, where the folds were constructed from random subsets of the 401 districts. For this, we used the cv.glmnet function from the R package glmnet [62] with family = negative.binomial(theta) and chose the λ value within one standard deviation from the minimal λ as regularization hyper-parameter.

Afterwards, we calculated the effects of the separate exposures on the outcome. For every exposure, we analyzed the different valid adjustment sets given by analysis of the causal DAG (i.e. the minimal and optimal adjustment sets). We first checked whether the respective set included unobserved variables. If this was the case for the optimal adjustment set, we discarded the unobserved variables from the set and checked whether it was still a valid adjustment set (function gac in package pcalg [63]). If a minimal adjustment set contained unobserved variables, we discarded the whole set. If no valid adjustment set for a given exposure was available, we concluded that the effect of this exposure was unidentifiable within our causal graph. We used the function glmnet with the parameters θ and λ as above on every remaining valid adjustment set as regressors (that is, we applied ridge regression) and calculated the Akaike information criterion (AIC) for this model/set of regressors. Finally, for every exposure, we chose the model/adjustment set (if available) with the lowest AIC. We report the exponentiated estimated coefficients for the separate exposures on their original scale.
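Building cross-validation folds from districts rather than from single observations keeps all days of one district in the same fold. The sketch below shows one way to construct such a fold assignment. The function name, the seed, and the round-robin assignment are our assumptions and not details given in the text. The resulting vector is the kind of per-observation fold label that cv.glmnet accepts through its foldid argument (with labels starting at 1 there).

```python
import random

def district_folds(district_ids, n_folds=10, seed=0):
    """Assign every observation the fold of its district, with districts split at random."""
    districts = sorted(set(district_ids))
    random.Random(seed).shuffle(districts)
    fold_of = {d: i % n_folds for i, d in enumerate(districts)}
    return [fold_of[d] for d in district_ids]

# 401 districts with 140 daily observations each, as in the study design.
ids = [k for k in range(401) for _ in range(140)]
folds = district_folds(ids)
print(len(folds), len(set(folds)))  # prints: 56140 10
```

With 401 districts and 10 folds, each fold contains 40 or 41 complete district time series, so the validation error reflects performance on districts not seen during fitting.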

Results

Descriptive statistics for the included variables are presented in Table 2.

Table 2. Descriptive statistics for observed variables.

Variable mean (SD)
n 56140
Mobility
 Retail and recreation -26.62 (24.60)
 Grocery and pharmacy -3.94 (22.77)
 Parks 47.26 (58.20)
 Workplaces -22.96 (20.35)
 Residential 8.13 (6.49)
 Transit stations -29.58 (21.11)
Awareness
 Searches corona 26.94 (18.23)
 COVID-19 burden 3.50 (10.28)
Weather
 Rainfall 1.89 (4.01)
 Temperature 10.90 (5.33)
 Humidity 67.81 (13.03)
 Wind 3.63 (1.66)
Interventions
 Ban of mass gatherings 0.83 (0.38)
 School and kindergarten closures 0.54 (0.36)
 Contact restrictions 0.74 (0.44)
 Mandatory face masks 0.49 (0.50)
Socio-demographic
 Age (pop. 65 and older) 22.09 (2.74)
 Age (pop. younger 18) 16.17 (1.25)
 Gender 50.59 (0.64)
 Population density 533.75 (701.84)
 Foreign citizens 10.03 (5.14)
 Foreign citizens (refugees) 1.88 (1.14)
 Socio-economic status 30.64 (6.02)
 Turnout 75.08 (3.79)
 Right-wing populist party votes 13.39 (5.32)
 Nursing homes 36.11 (30.69)
Case numbers (Outcome and offset)
 Reported new cases COVID-19 3.53 (10.29)
 Active cases 48.76 (120.86)

In the observational period, the number of daily reported COVID-19 cases increased until the end of March/beginning of April and continually decreased afterwards until the beginning of June 2020, followed by a slight increase and decrease (Fig 2A). On the other hand, the (log-)ratio of reported cases over active cases decreased steeply until mid-April and increased steadily afterwards, with a slight decrease close to the end of the observation period (Fig 2B). Both figures exemplify a considerable variation among the districts (light blue points are individual districts’ data).

Fig 2. Temporal and district level variation of outcome (log-scale).

In Germany, we observed a rebound in mobility after the initial political measures; reductions in incident cases were associated with a diminishing public interest in COVID-19, and temperatures were overall increasing (cf. Fig 3). The correlations between temporal progression and mobility in retail and recreation, awareness (“Searches corona”), and temperature were $r_{A,B} = 0.02$, $r_{A,C} = -0.3$, and $r_{A,D} = 0.8$, respectively.

Fig 3. Temporal variation of outcome and main determinants.

Main results

We list the results of our causal analysis for the effects of different exposure variables in Table 3. The estimates are multiplicative rates of increase/decrease for a one unit increase of the respective variable: Values above 1 lead to an increase, below 1 to a decrease of the infection rate. To put these estimates into perspective, Fig 4 shows the relative causal effect of the different exposure variables on the number of reported COVID-19 cases on a range of sensible values of the exposure variables (95 percent quantiles of data points).

Table 3. Effect estimates from causal graph analysis.

Cause Effect estimate
Mobility
 Retail and recreation 1.0011
 Grocery and pharmacy 0.9977
 Parks 0.9997
 Transit stations 1.0026
 Workplaces 1.0033
 Residential 0.9903
Awareness
 Searches corona 1.0089
 COVID-19 burden 0.9980
Weather
 Temperature 0.9905
 Rainfall 1.0121
 Humidity 1.0057
 Wind 1.0329
Interventions
 Interventions (ban of mass gatherings) 0.9729
 Interventions (school and kindergarten closures) 0.9277
 Interventions (contact restrictions) 0.8314
 Interventions (mandatory face masks) 0.9064
Demographic
 Age (pop. 65 and older) 0.9953
 Age (pop. younger 18) 1.0120
 Foreign citizens 1.0048
 Foreign citizens (refugees) 0.9985
 Gender 0.9925
 Nursing homes 1.0011
 Population density 1.0000
 Socio-economic status 0.9982

Fig 4. Relative causal effects of exposures.

Within our framework, we saw very different effects for individual mobility variables. For mobility in retail/recreation, an increase of 1 percentage point in mobility compared to the reference period (3 January to 6 February 2020) leads to an increase of the daily reported case number by about 0.11 percent. Similarly, mobility in workplaces showed an effect of a 0.33 percent increase in case numbers for every 1 percentage point increase in mobility, while mobility at transit stations showed an effect of a 0.26 percent increase in case numbers for every 1 percentage point increase. In contrast, the remaining three mobility variables showed negative effects on the number of reported COVID-19 cases. An increase of 1 percentage point in mobility for grocery/pharmacy leads to a decrease in the reported case number by approximately 0.23 percent, an increase of 1 percentage point in mobility within parks leads to a decrease by approximately 0.03 percent, and an increase of 1 percentage point in residential mobility leads to a decrease by approximately 0.97 percent. Fig 4 shows the effects of mobility on a range of possible values. Thus, we expect an increase of daily cases by approximately 7.8 percent if mobility in workplaces rises from its observed mean (−22.96 percent, Table 2) to the baseline level of 0 percent difference to the reference period. On the other hand, an increase of residential mobility from its observed mean (8.13 percent, Table 2) to 10 percentage points above the reference period leads to a reduction of the infection rate by approximately 1.8 percent.

“Awareness” had two opposite effects on the outcome in our DAG. Awareness measured by Google searches for corona had a positive effect on the number of reported cases. A one percentage point increase of the state’s Google searches (relative to other states and the observation period) leads to an increase of approximately 0.89 percent. For example, if a district shows 10 percentage points more relative searches for corona than another one, we expect approximately 9.3 percent more infections for this district after 5 days. COVID-19 burden (reported number of cases on day of exposure) affected the outcome negatively, where every additional daily case in the district leads to a 0.2 percent decrease in newly reported case numbers. The corresponding plot in Fig 4 visualizes this relationship: For a local outbreak with 20 daily cases as COVID-19 burden, we estimate as total causal effect a subsequent reduction of the infection rate by 3.9 percent.

Within our model, we observed effects of temperature and all other weather variables. Every increase of 1 degree Celsius in temperature leads to a reduction of the daily reported case numbers by approximately 0.95 percent. On the other hand, we found an increasing effect of rainfall: One millimeter (=1 liter per square meter) more rainfall leads to an increase of reported case numbers by approximately 1.21 percent. We observe effects for humidity and wind as well (higher humidity and stronger wind leading to more cases). In perspective (Fig 4), with temperature we expect an increase by approximately 21 percent at a daily average temperature of 0°C compared to a day with 20°C. For rainfall, we expect on a rainy day with 10 mm rainfall a corresponding increase of the infection rate by approximately 12.8 percent compared to a day with no precipitation.
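The awareness and weather figures quoted above follow from the multiplicative model: a change of Δ units in an exposure multiplies the infection rate by the Table 3 estimate raised to the power Δ. The short sketch below reproduces these numbers from the rounded estimates in Table 3. The function name is ours, and small deviations from the published figures can arise from rounding of the coefficients.

```python
def relative_change_percent(rate_per_unit, delta):
    """Percent change of the infection rate when the exposure changes by `delta` units."""
    return (rate_per_unit ** delta - 1.0) * 100.0

# Estimates taken from Table 3.
searches = relative_change_percent(1.0089, 10)     # 10 percentage points more searches
burden = relative_change_percent(0.9980, 20)       # 20 additional daily cases
temperature = relative_change_percent(0.9905, -20) # 0 °C compared to 20 °C
rainfall = relative_change_percent(1.0121, 10)     # 10 mm compared to no rain

print(round(searches, 1), round(burden, 1), round(temperature), round(rainfall, 1))
# prints: 9.3 -3.9 21 12.8
```

Note that the effects compound multiplicatively, which is why 10 units of rainfall give 12.8 percent rather than 10 × 1.21 = 12.1 percent.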

The different intervention variables showed the strongest effects in our analysis, see Table 3. While the first intervention (ban of mass gatherings) reduced subsequent daily case numbers by 2.7 percent, the closure of schools/kindergartens reduced infections by an additional 7.2 percent and mandatory face masks reduced them by another 9.4 percent. The effect of contact restrictions was the strongest in our observation period, with a reduction of the case rates by 16.9 percent.

The effects of the different socio-demographic factors are quite small in comparison to the effects described above. We see an increasing effect on case numbers of additional nursing homes between districts. Districts with a younger population, more foreign citizens, higher population density, and a lower average socio-economic status showed higher case numbers, too.

For all exposures, our analysis pipeline opted to use the (reduced) optimal adjustment set over the minimal adjustment sets because of lower AICs, except for the exposure variable “nursing homes”, for which the minimal adjustment set had the lowest AIC. For an overview of all final adjustment sets, see Table 4. We found that there were no valid adjustment sets for the non-identifiable variables turnout and right-wing populist party votes.

Table 4. Final adjustment sets for causal analysis.

Mobility Searches corona COVID-19 burden Temperature Rainfall Humidity Wind Interventions Age Foreign citizens Gender Nursing homes Population density Socio-economic status
Weekday x x x x x x x x x x x x x x
Holiday (report) x x x x x x x x x x x x x x
Holiday (exposure) x x x x x x x x x x x x x x
Mobility
 Mobility x
Awareness
 Searches corona x x x x x x
 COVID-19 burden x x x x x x x x x x x x x
Weather
 Temperature x x x x x x x x x x
 Rainfall x x x x x x x x x x x x
 Humidity x x x x x x x x x
 Wind x x x x x x x x x x x x
Interventions
 Interventions x x x x x x x x x x x x
Socio-demographic
 Age x x x x x x x x x x x
 Gender x x x x x x x x x x x x x
 Population density x x x x x x x x x x x x x
 Foreign citizens x x x x x x x
 Socio-economic status x x x x x x x x x x
 Turnout x x
 Right-wing populist party votes x x x x x x x x x
 Nursing homes x x x x x x x x x

We decided on a lag of 5 days based on cross-validation. Similarly, negative binomial regression was chosen over Poisson regression because the latter showed overdispersion and a higher AIC value.
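The comparison between Poisson and negative binomial models can be illustrated with a minimal Python sketch. It is not the regression pipeline of this study: it fits intercept-only models to invented toy counts, estimates the negative binomial size parameter by the method of moments rather than by maximum likelihood, and all function names are our own. It only shows how a variance far above the mean (overdispersion) translates into a lower AIC for the negative binomial model.

```python
import math


def poisson_loglik(counts, mu):
    """Log-likelihood of i.i.d. Poisson counts with mean mu."""
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1) for k in counts)


def negbin_loglik(counts, mu, size):
    """Log-likelihood of i.i.d. negative binomial counts with mean mu
    and variance mu + mu**2 / size."""
    p = size / (size + mu)
    return sum(
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(p) + k * math.log(1.0 - p)
        for k in counts
    )


def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik


# Invented toy counts for illustration only, NOT data from the study.
toy_counts = [0, 1, 0, 3, 12, 0, 2, 25, 1, 0, 7, 4]
n = len(toy_counts)
mean = sum(toy_counts) / n
variance = sum((k - mean) ** 2 for k in toy_counts) / (n - 1)

# A ratio near 1 is compatible with Poisson; a much larger one signals overdispersion.
dispersion_ratio = variance / mean

# Method-of-moments estimate of the size parameter (a simplifying assumption).
size = mean ** 2 / (variance - mean)

aic_poisson = aic(poisson_loglik(toy_counts, mean), n_params=1)
aic_negbin = aic(negbin_loglik(toy_counts, mean, size), n_params=2)
print(f"dispersion ratio: {dispersion_ratio:.1f}")
print(f"AIC Poisson: {aic_poisson:.1f}, AIC negative binomial: {aic_negbin:.1f}")
```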

Discussion

Main findings

Our objective was to identify effects of determining factors for COVID-19 cases within a causal framework. We found that weather affects the reported number of infections, especially temperature (which has a reducing effect on case numbers) and rainfall (which increases case numbers). We saw that reports of high case numbers in districts led to a reduction in new infection numbers, which indicates risk-averse awareness in the population and/or effective public health measures to suppress a local outbreak. Mobility showed distinct effects: Increasing activity in retail and recreational areas, as well as transit stations and workplaces increased reported case numbers, while increased movement for essential shopping (grocery and pharmacy) and in parks or residential areas led to reduced case numbers. All interventions considered (ban of mass gatherings, school/kindergarten closures, contact restrictions, mandatory face masks) reduced case numbers considerably. Socio-demographic variables had small effects individually, but in conjunction they explained larger case numbers in (urban) areas with younger population, lower socio-economic status, and higher population density.

Furthermore, we made a strong case for the use of causal DAGs in epidemiology and in a pandemic like COVID-19: DAGs allow confounders to be chosen for the analysis in a principled and statistically correct way while reducing possible causes of bias. In addition, the DAG formalization allows for discussion of the underlying causal assumptions.

Comparison with previous research

Most research on determinants affecting case numbers of COVID-19 is restricted to single aspects [5, 16, 32, 35]. To reliably identify (causal) drivers, one must adjust for confounders. To this end, we used an integrated model with variables from different aspects like mobility, awareness, weather, or socio-demographics and identified confounders by causal analysis with a directed acyclic graph. A causal approach is used in another current COVID-19 analysis [64]. There, however, they identify the causal relationships (reconstruct a DAG), while we estimated effects for a given hypothesized causal DAG.
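Why adjustment for confounders matters can be illustrated with a small simulation. The Python sketch below is a toy linear example under our own assumptions (one confounder Z that affects both exposure X and outcome Y, Gaussian noise, invented coefficients); it is not the negative binomial model of this study, and the helper names are hypothetical. It compares the unadjusted slope of Y on X with the slope after adjusting for Z.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y regressed on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def adjusted_slope(x, z, y):
    """Slope of y on x adjusted for z: remove the linear part explained
    by z from both x and y, then regress the residuals on each other."""
    b_xz = slope(z, x)
    b_yz = slope(z, y)
    rx = [a - b_xz * c for a, c in zip(x, z)]
    ry = [b - b_yz * c for b, c in zip(y, z)]
    return slope(rx, ry)


TRUE_EFFECT = 0.5  # invented causal effect of X on Y in this toy model
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]                 # confounder
x = [0.8 * c + rng.gauss(0, 1) for c in z]              # exposure depends on Z
y = [TRUE_EFFECT * a + 1.5 * c + rng.gauss(0, 1)        # outcome depends on X and Z
     for a, c in zip(x, z)]

naive = slope(x, y)
adjusted = adjusted_slope(x, z, y)
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}, true: {TRUE_EFFECT}")
```

In this toy setting the unadjusted estimate is clearly biased upward, while the adjusted estimate lies close to the true effect; a DAG serves to decide which variables must play the role of Z.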

Several studies assessing the impact of public health measures on mobility have each observed a downward trend accompanied by a decrease in the number of newly reported cases [15–17, 19, 23, 47].

Our findings regarding awareness/Google Trends analysis are in good agreement with the correlations found by others [4, 6, 26], who conclude that alertness to COVID-19 rises several days before the highest number of cases is reported. It should be noted that awareness is substantially influenced by public media coverage, which should be considered, if possible, in future studies [4]. As such, awareness is difficult to measure, and here the number of Google searches for “corona” can only be a proxy for this concept.

In addition, in alignment with other recently published studies, our results confirm evidence of a negative effect of temperature on new COVID-19 cases [7–9, 31–36]. This is, however, at odds with other scientific literature describing no effects [22, 39–42] or even converse correlations [37, 38]. The conflicting results might be explained by different climates and characteristics of the populations under study. While we are confident that our strict causal analysis resulted in effect estimates as undistorted as possible, there might be unconsidered bias in those other studies. Further research needs to be done to elucidate the biological characteristics of the novel virus SARS-CoV-2 regarding its ambient temperature survival and transmission. Finally, we found a positive effect of increased precipitation on COVID-19 case numbers, which supports previous observations [33].

A recent review on COVID-19 based on evidence from the US and UK concludes that low socio-economic status groups are being hit harder by the pandemic [65]. Albeit specific pathways remain unclear, many studies found associations with poverty or its correlates such as poor and potentially overcrowded housing conditions. For Germany, a higher case fatality of COVID-19 cases in districts with higher socio-economic deprivation has also been reported just recently, which was especially pronounced in the second wave of the pandemic [66]. Similarly, our analysis identified a decreasing effect on COVID-19 case numbers within districts with a higher socio-economic status during the first wave.

Limitations and strengths

While use of a causal DAG is itself a strong tool to identify causal effects (and not just statistical associations), it introduces two limitations: causal assumptions within the graph (depicted by edges) need to be well justified, and the statistical regression model that calculates total causal effects needs to be appropriate for the task at hand. We endorse our graph as a basis for discussion on residual confounding. We did not try to construct the DAG from the available data (cf. [64]). As such, our proposed DAG is not entirely consistent with the data, and there are conditional dependencies between variables that cannot be resolved by adding edges to the DAG (e.g. between interventions like contact restrictions and mandatory face masks). Another way to identify potential problems in the proposed DAG is to perform a sensitivity analysis of its structure by inspecting its maximal ancestral graph (MAG) or its Markov equivalence class, represented by a completed partially directed acyclic graph (CPDAG), and the existence of valid adjustment sets for these generalized graphs [67]. For the MAG derived from our DAG, only the effects for the exposures mobility and searches for corona can be estimated with valid adjustment sets, while for the Markov equivalence class all exposures but COVID-19 burden lead to valid adjustment sets. A further analysis of these implications is out of the scope of this paper.

We observed overdispersion and a substantial increase in model performance with a negative binomial regression compared to Poisson regression, which is in line with the results on COVID-19 daily case counts of [17] and others [7, 9, 68]. We did not model case counts with a differential equation model like the classic SIR-model [69] and its successors, since these are more suited to prediction e.g. [70], while our choice of a negative binomial regression framework allowed us to estimate the effects of confounders more reliably. There are more advanced statistical methods for count data, e.g. zero-inflated models and mixed models. We tested both approaches as extensions to the negative binomial regression and experienced numerical problems and increased computing time, along with an insubstantial increase in model performance. Furthermore, our model assumed that all variables have effects proportional to the size of their measurements. It is possible that some variables show saturation effects or opposite effects for low, medium, or high values. This could be modeled with polynomial or other transformations of the variables, which we did not employ due to limited temporal and spatial data availability. Interaction effects of variables and confounding effects or mediating variables are explicitly taken care of by deriving the valid adjustment sets for a given exposure based on the causal DAG. Use of a fixed DAG with effect estimation via regression assumes that data was generated by the same underlying process for the observation period. By inclusion of the successive mitigation interventions as binary variables we were able to explain some of the variance caused by the changing dynamics of case numbers (similar to [68]). 
While multicollinearity of variables poses less of a problem for a proper causal graph analysis [71], we addressed the problem of multicollinearity in our predictors by two approaches: principal component analysis for the highly collinear mobility variables as well as a regularized regression approach (ridge regression). The latter (in conjunction with cross-validation) also reduced the problem of overfitting.
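How ridge regularization stabilizes estimates for collinear predictors can be shown with a minimal Python sketch. It uses ordinary linear ridge regression with two centered predictors and invented toy values, solved in closed form; this is a simplification under our own assumptions and not the regularized negative binomial model used in the analysis. The function name `ridge_two_predictors` is hypothetical.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y for two centered predictors.
    lam = 0 gives ordinary least squares."""
    a = sum(u * u for u in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    r1 = sum(u * t for u, t in zip(x1, y))
    r2 = sum(v * t for v, t in zip(x2, y))
    det = a * d - b * b
    return ((d * r1 - b * r2) / det, (a * r2 - b * r1) / det)


# Invented, centered toy values: x1 and x2 are almost perfectly collinear.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.2, -1.8, 0.3, 1.7, 4.0]

ols = ridge_two_predictors(x1, x2, y, lam=0.0)
ridge = ridge_two_predictors(x1, x2, y, lam=1.0)


def squared_norm(beta):
    return sum(c * c for c in beta)


print("OLS coefficients:  ", ols)
print("ridge coefficients:", ridge)
```

In this toy example the unpenalized coefficients take opposite signs although both predictors carry nearly the same information, whereas the ridge coefficients are of similar size and the coefficient vector is shorter.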

We stress that our effects were deduced on an aggregate (district) level in the absence of available data on an individual level. As such, conclusions about effects cannot be transferred to individuals without risking an ecological fallacy. Furthermore, as we were using administrative data for our analysis, the results are susceptible to the Modifiable Areal Unit Problem (MAUP) [72]. The MAUP postulates that different regional aggregations of the units of observation may lead to different results and conclusions. Due to limited available data for the different variables, there is currently no way to overcome these problems, which are inherent to all analyses on an aggregated data level.

Our observation period was restricted to the succession from late winter to spring and summer (February to July). Nevertheless, this transition with increasing temperature was a natural experiment that provided clues about weather effects.

We could not include data on health care utilization during the pandemic into our models due to the lack of available resources. This is planned for a later follow up to this paper since we rank health care utilization and mobility within health care facilities among the strong factors for COVID-19 progression: personnel in hospitals and private practices is particularly exposed to infection, while the lack of adequate care for other diseases has severe effects on general health of the population. At the same time, health care facilities are key for testing and surveillance of COVID-19 patients.

Social determinants of health are important factors to consider in an epidemiological framework of a pandemic disease like COVID-19. To account for this problem, we included several socio-economic confounders that were available on a district level in Germany. While our analysis is not an exhaustive analysis of the effects of social determinants on COVID-19 infections, we emphasize the necessity of their inclusion and our results add to the growing body of evidence that these factors interact with each other and cluster especially among people or within areas of underprivileged conditions, with detrimental effects on population health [73].

While our analysis focused on Germany and its districts, we assume that results may be transferred to other countries by adjusting for their respective weather conditions, mobility habits, socio-demographic characteristics, and other determining factors.

The code and resources for our analysis are available on GitHub. We invite other researchers to replicate our analysis with different assumptions using the files provided in the repository of the article (https://github.com/zidatalab/causalcovid19).

Discussion of causal effects

In our analysis, the adverse effects of mobility in retail/recreation and workplaces and the favorable effect of mobility in grocery/pharmacy and residential areas indicate that interventions like contact restrictions, which limit the number of individual interactions, can lead to reduced infection numbers. This is because retail/recreational and workplace areas encompass mostly places of (social) gatherings, whereas people who do more of their essential shopping at supermarkets and stay at home with less contact with other people are less likely to come into contact with infected individuals.

The effects of awareness measured via searches for “corona” and the COVID-19 burden are harder to interpret. We assume that within our model, the searches for “corona” are an insufficient proxy for awareness, while the decreasing effect for future case numbers of high daily COVID-19 burden indicates it affects individual risk-behavior and entails effective non-pharmaceutical interventions.

Similarly, the effects of temperature and rainfall can be interpreted as causal effects on indoor and outdoor activities, such that higher temperatures and low rainfall indicate more people spending time outdoors, while lower temperatures and high rainfall result in indoor activities, which lead to more infections. Current research suggests this to be due to the prevalent airborne transmission of the SARS-CoV-2 virus via respiratory droplets and aerosols [74]. In this light, we advocate for precautionary measures like increased hygiene, face masks, and air ventilation for unavoidable indoor activities.

Furthermore, our analyses strongly support the effectiveness of non-pharmaceutical interventions. To a lesser extent, the adverse effects of some socio-demographic factors might help to identify areas that are at higher risk of local COVID-19 outbreaks and more severe outcomes of infection cases.

Conclusion

To the best of our knowledge, this is the most comprehensive analysis of causes for COVID-19 infections which integrates different data sources (all publicly available). Causal reasoning with a DAG allows us to estimate the possible causal effects more reliably.

Our findings suggest that the infection-driving effects of mobility, awareness, and weather (and to some extent socio-demographic factors) need to be taken into account when deciding on mitigation and suppression interventions, depending on the recent and future COVID-19 pandemic development.

Acknowledgments

We are thankful for feedback from Thomas Czihal, Johannes Textor, Ralph Brinks, and an anonymous reviewer who gave helpful suggestions on earlier versions of the manuscript.

Data Availability

All relevant data are available from public sources and are aggregated in the GitHub repository pertaining to the manuscript: https://github.com/zidatalab/causalcovid19.

Funding Statement

The authors received no specific funding for this work.

References

  • 1. WHO Team. Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19); 2020, accessed 2020-06-25. Available from: https://www.who.int/publications-detail/report-of-the-who-china-joint-mission-on-coronavirus-disease-2019-(covid-19).
  • 2. Chinazzi M, Davis JT, Ajelli M, Gioannini C, Litvinova M, Merler S, et al. The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak. Science. 2020;368(6489):395–400. doi: 10.1126/science.aba9757
  • 3. Guan W, Ni Z, Hu Y, Liang W, Ou C, He J, et al. Clinical Characteristics of Coronavirus Disease 2019 in China. New England Journal of Medicine. 2020. doi: 10.1056/NEJMoa2002032
  • 4. Higgins TS, Wu AW, Sharma D, Illing EA, Rubel K, Ting JY. Correlations of Online Search Engine Trends With Coronavirus Disease (COVID-19) Incidence: Infodemiology Study. JMIR Public Health and Surveillance. 2020;6(2):e19702. doi: 10.2196/19702
  • 5. Li C, Chen LJ, Chen X, Zhang M, Pang CP, Chen H. Retrospective analysis of the possibility of predicting the COVID-19 outbreak from Internet searches and social media data, China, 2020. Euro Surveillance. 2020;25(10). doi: 10.2807/1560-7917.ES.2020.25.10.2000199
  • 6. Yuan X, Xu J, Hussain S, Wang H, Gao N, Zhang L. Trends and Prediction in Daily New Cases and Deaths of COVID-19 in the United States: An Internet Search-Interest Based Model. Exploratory research and hypothesis in medicine. 2020;5(2):1–6. doi: 10.14218/ERHM.2020.00023
  • 7. Bannister-Tyrrell M, Meyer A, Faverjon C, Cameron A. Preliminary evidence that higher temperatures are associated with lower incidence of COVID-19, for cases reported globally up to 29th February 2020. medRxiv. 2020. doi: 10.1101/2020.03.18.20036731
  • 8. Demongeot J, Flet-Berliac Y, Seligmann H. Temperature Decreases Spread Parameters of the New Covid-19 Case Dynamics. Biology. 2020;9(5). doi: 10.3390/biology9050094
  • 9. Liu J, Zhou J, Yao J, Zhang X, Li L, Xu X, et al. Impact of meteorological factors on the COVID-19 transmission: A multi-city study in China. Science of the Total Environment. 2020;726:138513. doi: 10.1016/j.scitotenv.2020.138513
  • 10. Greenland S, Robins JM, Pearl J. Confounding and Collapsibility in Causal Inference. Statistical Science. 1999;14(1):29–46. doi: 10.1214/ss/1009211805
  • 11. Schipf S, Knüppel S, Hardt J, Stang A. Directed Acyclic Graphs (DAGs)—Die Anwendung kausaler Graphen in der Epidemiologie. Gesundheitswesen. 2011;73(12):888–892. doi: 10.1055/s-0031-1291192
  • 12. Textor J, van der Zander B, Gilthorpe MS, Liśkiewicz M, Ellison GT. Robust causal inference using directed acyclic graphs: the R package ‘dagitty’. International Journal of Epidemiology. 2017;45(6):1887–1894. doi: 10.1093/ije/dyw341
  • 13. Center for Systems Science and Engineering (CSSE). COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University; 2020. Available from: https://github.com/CSSEGISandData/COVID-19.
  • 14. Pearl J, Bareinboim E. External Validity: From Do-Calculus to Transportability Across Populations. Statistical Science. 2014;29(4):579–595. doi: 10.1214/14-STS486
  • 15. Chang MC, Kahn R, Li YA, Lee CS, Buckee CO, Chang HH. Variation in human mobility and its impact on the risk of future COVID-19 outbreaks in Taiwan. BMC Public Health. 2021;21(1):226. doi: 10.1186/s12889-021-10260-7
  • 16. Fowler JH, Hill SJ, Obradovich N, Levin R. The Effect of Stay-at-Home Orders on COVID-19 Cases and Fatalities in the United States. medRxiv. 2020. doi: 10.1101/2020.04.13.20063628
  • 17. Kraemer MUG, Yang CH, Gutierrez B, Wu CH, Klein B, Pigott DM, et al. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science (New York, NY). 2020;368(6490):493–497. doi: 10.1126/science.abb4218
  • 18. Lasry A, Kidder D, Hast M, Poovey J, Sunshine G, Winglee K, et al. Timing of Community Mitigation and Changes in Reported COVID-19 and Community Mobility—Four U.S. Metropolitan Areas, February 26-April 1, 2020. MMWR Morbidity and mortality weekly report. 2020;69(15):451–457. doi: 10.15585/mmwr.mm6915e2
  • 19. Linka K, Peirlinck M, Sahli Costabal F, Kuhl E. Outbreak dynamics of COVID-19 in Europe and the effect of travel restrictions. Computer Methods in Biomechanics and Biomedical Engineering. 2020; p. 1–8. doi: 10.1080/10255842.2020.1759560
  • 20. Mazzoli M, Mateo D, Hernando A, Meloni S, Ramasco JJ. Effects of mobility and multi-seeding on the propagation of the COVID-19 in Spain. medRxiv. 2020. doi: 10.1101/2020.05.09.20096339
  • 21. Xiong C, Hu S, Yang M, Younes HN, Luo W, Ghader S, et al. Data-Driven Modeling Reveals the Impact of Stay-at-Home Orders on Human Mobility during the COVID-19 Pandemic in the U.S. arXiv e-prints. 2020; p. arXiv:2005.00667.
  • 22. Jüni P, Rothenbühler M, Bobos P, Thorpe KE, da Costa BR, Fisman DN, et al. Impact of climate and public health interventions on the COVID-19 pandemic: A prospective cohort study. Canadian Medical Association Journal. 2020. doi: 10.1503/cmaj.200920
  • 23. Lai S, Ruktanonchai NW, Zhou L, Prosper O, Luo W, Floyd JR, et al. Effect of non-pharmaceutical interventions to contain COVID-19 in China. Nature. 2020. doi: 10.1038/s41586-020-2293-x
  • 24. Google LLC. Google Trends, search term “corona”; 2020, accessed 2020-06-25. Available from: https://www.google.com/trends.
  • 25. Ayyoubzadeh SM, Ayyoubzadeh SM, Zahedi H, Ahmadi M, R Niakan Kalhori S. Predicting COVID-19 Incidence Through Analysis of Google Trends Data in Iran: Data Mining and Deep Learning Pilot Study. JMIR Public Health and Surveillance. 2020;6(2):e18828. doi: 10.2196/18828
  • 26. Effenberger M, Kronbichler A, Shin JI, Mayer G, Tilg H, Perco P. Association of the COVID-19 pandemic with Internet Search Volumes: A Google Trends(TM) Analysis. International Journal of Infectious Diseases. 2020;95:192–197. doi: 10.1016/j.ijid.2020.04.033
  • 27. Lin YH, Liu CH, Chiu YC. Google searches for the keywords of “wash hands” predict the speed of national spread of COVID-19 outbreak among 21 countries. Brain, Behavior, and Immunity. 2020. doi: 10.1016/j.bbi.2020.04.020
  • 28. Mavragani A. Tracking COVID-19 in Europe: Infodemiology Approach. JMIR Public Health and Surveillance. 2020;6(2):e18941. doi: 10.2196/18941
  • 29. Walker A, Hopkins C, Surda P. Use of Google Trends to investigate loss-of-smell-related searches during the COVID-19 outbreak. International Forum of Allergy & Rhinology. 2020;10(7):839–847. doi: 10.1002/alr.22580
  • 30. Zhou WK, Wang AL, Xia F, Xiao YN, Tang SY. Effects of media reporting on mitigating spread of COVID-19 in the early phase of the outbreak. Mathematical Biosciences and Engineering. 2020;17(3):2693–2707. doi: 10.3934/mbe.2020147
  • 31. Qi H, Xiao S, Shi R, Ward MP, Chen Y, Tu W, et al. COVID-19 transmission in Mainland China is associated with temperature and humidity: A time-series analysis. Science of the Total Environment. 2020;728:138778. doi: 10.1016/j.scitotenv.2020.138778
  • 32. Shi P, Dong Y, Yan H, Zhao C, Li X, Liu W, et al. Impact of temperature on the dynamics of the COVID-19 outbreak in China. Science of the Total Environment. 2020;728:138890. doi: 10.1016/j.scitotenv.2020.138890
  • 33. Sobral MFF, Duarte GB, da Penha Sobral AIG, Marinho MLM, de Souza Melo A. Association between climate variables and global transmission of SARS-CoV-2. Science of the Total Environment. 2020;729:138997. doi: 10.1016/j.scitotenv.2020.138997
  • 34. Tosepu R, Gunawan J, Effendy DS, Ahmad LOAI, Lestari H, Bahar H, et al. Correlation between weather and Covid-19 pandemic in Jakarta, Indonesia. Science of the Total Environment. 2020;725:138436. doi: 10.1016/j.scitotenv.2020.138436
  • 35. Wang M, Jiang A, Gong L, Luo L, Guo W, Li C, et al. Temperature significant change COVID-19 Transmission in 429 cities. medRxiv. 2020. doi: 10.1101/2020.02.22.20025791
  • 36. Wu Y, Jing W, Liu J, Ma Q, Yuan J, Wang Y, et al. Effects of temperature and humidity on the daily new cases and new deaths of COVID-19 in 166 countries. Science of the Total Environment. 2020;729:139051. doi: 10.1016/j.scitotenv.2020.139051
  • 37. Auler AC, Cássaro FAM, da Silva VO, Pires LF. Evidence that high temperatures and intermediate relative humidity might favor the spread of COVID-19 in tropical climate: A case study for the most affected Brazilian cities. Science of the Total Environment. 2020;729:139090. doi: 10.1016/j.scitotenv.2020.139090
  • 38. Xie J, Zhu Y. Association between ambient temperature and COVID-19 infection in 122 cities from China. Science of the Total Environment. 2020;724:138201. doi: 10.1016/j.scitotenv.2020.138201
  • 39. Briz-Redón Á, Serrano-Aroca Á. A spatio-temporal analysis for exploring the effect of temperature on COVID-19 early evolution in Spain. Science of the Total Environment. 2020;728:138811. doi: 10.1016/j.scitotenv.2020.138811
  • 40. Iqbal N, Fareed Z, Shahzad F, He X, Shahzad U, Lina M. The nexus between COVID-19, temperature and exchange rate in Wuhan city: New findings from partial and multiple wavelet coherence. Science of the Total Environment. 2020;729:138916. doi: 10.1016/j.scitotenv.2020.138916
  • 41. Jahangiri M, Jahangiri M, Najafgholipour M. The sensitivity and specificity analyses of ambient temperature and population size on the transmission rate of the novel coronavirus (COVID-19) in different provinces of Iran. Science of the Total Environment. 2020;728:138872. doi: 10.1016/j.scitotenv.2020.138872
  • 42. Yao Y, Pan J, Liu Z, Meng X, Wang W, Kan H, et al. No association of COVID-19 transmission with temperature or UV radiation in Chinese cities. The European Respiratory Journal. 2020;55(5). doi: 10.1183/13993003.00517-2020
  • 43. de Lusignan S, Dorward J, Correa A, Jones N, Akinyemi O, Amirthalingam G, et al. Risk factors for SARS-CoV-2 among patients in the Oxford Royal College of General Practitioners Research and Surveillance Centre primary care network: a cross-sectional study. The Lancet Infectious Diseases. doi: 10.1016/S1473-3099(20)30371-6
  • 44. Wahrendorf M, Rupprecht CJ, Dortmann O, Scheider M, Dragano N. Erhöhtes Risiko eines COVID-19-bedingten Krankenhausaufenthaltes für Arbeitslose: Eine Analyse von Krankenkassendaten von 1,28 Mio. Versicherten in Deutschland. Bundesgesundheitsblatt—Gesundheitsforschung—Gesundheitsschutz. 2021;64(3):314–321. doi: 10.1007/s00103-021-03280-6
  • 45. Dohle S, Wingen T, Schreiber M. Acceptance and Adoption of Protective Measures During the COVID-19 Pandemic: The Role of Trust in Politics and Trust in Science. Social Psychological Bulletin. 2020;15(4):1–23. doi: 10.32872/spb.4315
  • 46. Engle S, Stromme J, Zhou A. Staying at home: mobility effects of COVID-19. Available at SSRN. 2020. doi: 10.2139/ssrn.3565703
  • 47. Cowling BJ, Ali ST, Ng TWY, Tsang TK, Li JCM, Fong MW, et al. Impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in Hong Kong: an observational study. The Lancet Public Health. 2020;5(5):e279–e288. doi: 10.1016/S2468-2667(20)30090-6
  • 48. Robert Koch-Institut (RKI). Fallzahlen in Deutschland (COVID-19); 2020, accessed 2020-07-12. Available from: https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Fallzahlen.html.
  • 49. Google LLC. Google COVID-19 Community Mobility Reports; 2020, accessed 2020-06-25. Available from: https://www.google.com/covid19/mobility/.
  • 50. Deutscher Wetterdienst (DWD) Climate Data Center (CDC). Recent daily station observations (temperature, pressure, precipitation, sunshine duration, etc.) for Germany, quality control not completed yet, version recent; 2020, accessed 2020-07-12. Available from: https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/daily/kl/recent/.
  • 51. Bundesinstitut für Bau-, Stadt- und Raumforschung (BBSR). INKAR—Indikatoren und Karten zur Raum- und Stadtentwicklung; 2020, accessed 2020-06-25. Available from: https://www.inkar.de/.
  • 52. Mitze T, Kosfeld R, Rode J, Wälde K. Face Masks Considerably Reduce COVID-19 Cases in Germany. Proceedings of the National Academy of Sciences. 2020;117(51):32293–32301. doi: 10.1073/pnas.2015954117
  • 53. Spirtes P, Glymour CN, Scheines R, Heckerman D. Causation, prediction, and search. MIT Press; 2000.
  • 54. Pearl J. Causality. Cambridge: Cambridge University Press; 2009. Available from: https://www.cambridge.org/core/books/causality/B0046844FAE10CBF274D4ACBDAEB5F5B.
  • 55. Greenland S, Pearl J, Robins JM. Causal Diagrams for Epidemiologic Research. Epidemiology. 1999;10(1):37–48. doi: 10.1097/00001648-199901000-00008
  • 56. Henckel L, Perković E, Maathuis MH. Graphical Criteria for Efficient Total Effect Estimation via Adjustment in Causal Linear Models. arXiv e-prints. 2020; p. arXiv:1907.02435.
  • 57. R Core Team. R: A Language and Environment for Statistical Computing; 2019. Available from: https://www.R-project.org/.
  • 58. Kalisch M, Mächler M, Colombo D, Maathuis MH, Bühlmann P. Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software. 2012;47(11):1–26. doi: 10.18637/jss.v047.i11
  • 59. Hilbe JM, Greene WH. 4—Count Response Regression Models. In: Rao CR, Miller JP, Rao DC, editors. Essential Statistical Methods for Medical Statistics. Boston: North-Holland; 2011. p. 104–145. Available from: http://www.sciencedirect.com/science/article/pii/B9780444537379500074.
  • 60. Venables WN, Ripley BD. Modern Applied Statistics with S. 4th ed. New York: Springer; 2002. Available from: http://www.stats.ox.ac.uk/pub/MASS4.
  • 61. Hoerl AE, Kennard RW. Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics. 1970;12(1):55–67. doi: 10.1080/00401706.1970.10488634
  • 62. Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software. 2010;33(1):1–22. doi: 10.18637/jss.v033.i01
  • 63. Perković E, Textor J, Kalisch M, Maathuis MH. A Complete Generalized Adjustment Criterion. arXiv e-prints. 2015; p. arXiv:1507.01524.
  • 64. Gencoglu O, Gruber M. Causal Modeling of Twitter Activity During COVID-19. Computation. 2020;8(4). doi: 10.3390/computation8040085
  • 65. Wachtler B, Michalski N, Nowossadeck E, Diercke M, Wahrendorf M, Santos-Hövener C, et al. Socioeconomic inequalities and COVID-19—A review of the current international literature. Journal of Health Monitoring. 2020;(S7):3–17. doi: 10.25646/7059
  • 66. Hoebel J, Michalski N, Wachtler B, Diercke M, Neuhauser H, Wieler LH, et al. Sozioökonomische Unterschiede im Infektionsrisiko während der zweiten SARS-CoV-2-Welle in Deutschland. Dtsch Arztebl International. 2021;118(15):269–270. doi: 10.3238/arztebl.m2021.0188
  • 67. Perković E, Textor J, Kalisch M, Maathuis MH. Complete graphical characterization and construction of adjustment sets in Markov equivalence classes of ancestral graphs. The Journal of Machine Learning Research. 2017;18(1):8132–8193.
  • 68. Islam N, Sharp SJ, Chowell G, Shabnam S, Kawachi I, Lacey B, et al. Physical distancing interventions and incidence of coronavirus disease 2019: natural experiment in 149 countries. BMJ. 2020;370. doi: 10.1136/bmj.m2743
  • 69. Kermack WO, McKendrick AG. Contributions to the mathematical theory of epidemics–I. 1927. Bulletin of mathematical biology. 1991;53(1-2):33–55. doi: 10.1007/bf02464423
  • 70. an der Heiden M, Buchholz U. Modellierung von Beispielszenarien der SARS-CoV-2-Epidemie 2020 in Deutschland. 2020. doi: 10.25646/6571.2
  • 71. Schisterman EF, Perkins NJ, Mumford SL, Ahrens KA, Mitchell EM. Collinearity and Causal Diagrams: A Lesson on the Importance of Model Specification. Epidemiology. 2017;28(1). doi: 10.1097/EDE.0000000000000554
  • 72. Openshaw S. Ecological Fallacies and the Analysis of Areal Census Data. Environment and Planning A: Economy and Space. 1984;16(1):17–31. 10.1068/a160017 [DOI] [PubMed] [Google Scholar]
  • 73.Solar O, Irwin A. A conceptual framework for action on the social determinants of health. WHO Document Production Services; 2010. Available from: https://drum.lib.umd.edu/handle/1903/23135.
  • 74. World Health Organization, et al. Transmission of SARS-CoV-2: implications for infection prevention precautions: Scientific Brief, 09 July 2020. World Health Organization; 2020. [Google Scholar]

Decision Letter 0

Sungwoo Lim

8 Feb 2021

PONE-D-20-23587

Causal analysis of COVID-19 observational data in German districts reveals effects of mobility, awareness, and temperature

PLOS ONE

Dear Dr. Steiger,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Mar 22 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Sungwoo Lim, DrPH

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This is an interesting manuscript that aims to identify "causal relationships" for COVID-19 transmission based on a DAG analysis and empirical data from other studies for variable selection.

The DAG analysis is a multivariate analysis that selects a subset of correlated variables from a large number of candidates. The authors first select the sets of variables that are associated with the outcome using the DAG analysis and then put them into the regression. The DAG analysis returns different sets of variables under different selection criteria (for example, the most parsimonious set or the most variation explained), so the authors presumably use R-squared to select the final sets of variables in the regression model.

The challenge I have here (and having reviewed this as well with our statistical team) is that the claims of causal relationships are not convincing, even if they identify associative relationships that other studies have also found.

Specifically:

1) The model does not address the nonlinearity of variables (such as temperature); other studies have found that temperature has a U-shaped effect at low and high temperatures. Humidity effects may also be nonlinear. Non-linearity in continuous variables needs to be considered.

2) No tests for multicollinearity are presented, which raises significant concerns about over-fitting the model. Temperature and humidity are an example here, as is the mobility data.

3) It feels as though the variables of interest are cherry-picked in the end. It is not clear how the authors arrived at the final variable list for causal effects. Why were the restrictions not included in that analysis?

4) The term “causal analysis” is a bit strong for what they have done here. The basis of the work is the proposed DAG (Figure 1), but the diagram was constructed from other association studies. Other than the DAG analysis, they did not do anything to ensure the results reflect a “causal relationship”. So I am not convinced that the analysis or results are causal. How do you avoid over-fitting the model or including mediating variables in the analysis?

5) What about interaction terms? For example, residential mobility and colder weather? Or rain? These relationships are not simply a straightforward multivariable model.

6) The most interesting terms in their model were the interventions (restrictions) themselves, yet they were dropped from the model. No discussion is really pursued on that. Why were public health interventions removed?
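
As an illustration of the multicollinearity concern raised above (this sketch is not part of the review or of the authors' analysis): for a model with exactly two predictors, the variance inflation factor (VIF) of either predictor equals 1 / (1 - r^2), where r is their Pearson correlation. The function names and the temperature and humidity values below are hypothetical and chosen only to show the calculation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2).

    Assumes the predictors are not perfectly collinear (|r| < 1).
    """
    r = pearson(x, y)
    return 1.0 / (1.0 - r * r)


# Made-up illustrative values, NOT data from the study.
temperature = [2.0, 5.0, 9.0, 14.0, 18.0, 21.0]
humidity = [85.0, 80.0, 78.0, 70.0, 66.0, 60.0]

vif = vif_two_predictors(temperature, humidity)
print(f"VIF for temperature/humidity: {vif:.1f}")
```

A VIF of 1 means the two predictors are uncorrelated; larger values mean the variance of the coefficient estimates is inflated by collinearity. With more than two predictors, each VIF would instead be computed from the R-squared of regressing that predictor on all the others.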

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 May 27;16(5):e0237277. doi: 10.1371/journal.pone.0237277.r002

Author response to Decision Letter 0


29 Mar 2021

Our responses to the reviewer's comments are in the PDF "Reply to the Reviewer".

Feel free to contact us if you need additional information.

Attachment

Submitted filename: replyreviewer.pdf

Decision Letter 1

Sungwoo Lim

13 Apr 2021

PONE-D-20-23587R1

Causal graph analysis of COVID-19 observational data in German districts reveals effects of determining factors on reported case numbers

PLOS ONE

Dear Dr. Steiger,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by May 28 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Sungwoo Lim, DrPH

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: I find the topic of the paper very important and the methodology very interesting. The authors have used a number of variables to analyse possible determinants of COVID-19 infections. While the paper acknowledges that there are social factors which affect the spread of COVID-19, these are not properly discussed in the paper. I strongly suggest that the authors refer to the literature on Social Determinants of Health when discussing social variables. For example, see the recent paper by Galanis and Hanieh in Social Science and Medicine on Incorporating Social Determinants of Health into Modelling of COVID-19, and the references therein.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No


Decision Letter 2

Sungwoo Lim

6 May 2021

Causal graph analysis of COVID-19 observational data in German districts reveals effects of determining factors on reported case numbers

PONE-D-20-23587R2

Dear Dr. Steiger,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Sungwoo Lim, DrPH

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Sungwoo Lim

17 May 2021

PONE-D-20-23587R2

Causal graph analysis of COVID-19 observational data in German districts reveals effects of determining factors on reported case numbers

Dear Dr. Steiger:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Sungwoo Lim

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: replyreviewer.pdf

    Data Availability Statement

    All relevant data are available from public sources and are aggregated in the GitHub repository accompanying the manuscript: https://github.com/zidatalab/causalcovid19.


    Articles from PLoS ONE are provided here courtesy of PLOS
