Elsevier - PMC COVID-19 Collection
. 2021 Apr 3;134:104369. doi: 10.1016/j.compbiomed.2021.104369

Elementary effects analysis of factors controlling COVID-19 infections in computational simulation reveals the importance of social distancing and mask usage

Kelvin KF Li a,b, Stephen A Jarvis c, Fayyaz Minhas a
PMCID: PMC8019252  PMID: 33915478

Abstract

COVID-19 was declared a pandemic by the World Health Organisation (WHO) on March 11th, 2020. With half of the world's countries in lockdown as of April 2020 due to this pandemic, monitoring and understanding the spread of the virus and infection rates, and how these factors relate to behavioural and societal parameters, is crucial for developing control strategies. This paper aims to investigate the effectiveness of masks, social distancing, lockdown and self-isolation for reducing the spread of SARS-CoV-2 infections. Our findings from agent-based simulation modelling show that, whilst requiring a lockdown is widely believed to be the most efficient method to quickly reduce infection numbers, the practice of social distancing and the usage of surgical masks can potentially be more effective than requiring a lockdown. Our multivariate analysis of simulation results using the Morris Elementary Effects Method suggests that if a sufficient proportion of the population uses surgical masks and follows social distancing regulations, then SARS-CoV-2 infections can be controlled without requiring a lockdown.

Keywords: COVID-19, Agent-based modelling, Coronavirus, Simulation, SARS-COV-2, netlogo, Python, Epidemiology, Survival, Infectious diseases, VIRUS, Stochastic processes, Stochasticity, Social distancing, Masks, Isolation, Lockdown

1. Introduction

COVID-19, informally known as the coronavirus, is a respiratory illness that is caused by the Severe Acute Respiratory Syndrome (SARS) Coronavirus 2 (SARS-CoV-2) [1]. It was declared a pandemic by the World Health Organisation (WHO) on March 11th, 2020. As large-scale distribution of vaccines for this virus can only be expected by late-2021 [2], controlling COVID-19's infection and fatality rates over the next year will be critically dependent on the successful implementation of public health measures, such as social distancing, isolation, quarantine and usage of masks [3].

It has almost become common knowledge that isolating symptomatic cases, enforcing a lockdown, social distancing and the usage of masks are amongst the most effective methods of reducing the spread of viral infections. Whilst there are many univariate analyses evaluating the impact of different preventative measures, such as cloth masks vs. surgical masks in reducing COVID-19 infections [4,5] and the effectiveness of various social distancing guidelines [6,7], there is a lack of literature that reviews the effectiveness of these preventative measures in combination and relative to each other. For example, is social distancing more effective than the isolation of symptomatic cases? Is a lockdown the most effective option to reduce infection numbers? How useful are masks relative to the aforementioned strategies?

Traditionally, statistical and mathematical models, such as the Susceptible-Infected-Recovered (SIR) model, have been used to simulate and predict the trends of pandemics. Unfortunately, they are mostly useful only for establishing the basic principles because of challenges such as numerical tractability and different research agendas. These models are usually oversimplified and fail to take into account human behaviour, human contact patterns, population diversity and variations of human hosts. This paper aims to answer the above questions via an agent-based modelling approach, which can efficiently capture emergent phenomena and inter-agent interactions, potentially providing more accurate answers to questions that traditional mathematical models cannot address.

2. Methods

The following section describes the methods used to investigate the effectiveness of different nonpharmaceutical interventions (NPIs) such as masks, social distancing, lockdown and self-isolation for reducing the spread of COVID-19 through the agent-based simulation framework developed in this work.

2.1. Agent-based simulation

In this work, we have developed an agent-based simulation based on the Susceptible-Infected-Recovered-Dead (SIRD) model by Li et al. [8].

Agent-based modelling (ABM) is a method of simulating the behaviour and interactions of autonomous agents in a particular environment across time steps. Autonomous agents can be thought of as individual entities that carry out some set of operations on behalf of a user but without the interference of that ownership entity [9]. Agents within the ABM are heterogeneous, meaning that each agent possesses a unique set of characteristics, including age group, gender, infection status, etc., that determine the agent's behaviour and how it interacts with other agents in the environment [10].

Agent-based models serve as an effective decision support tool to analyze the local and global impact of various factors associated with a pandemic and to test the effectiveness of various interventions and control strategies [11]. They have been used to test combinations of individual-level (e.g., usage of masks) and mass-action strategies (e.g., lockdown) in different countries [12], as well as in smaller urban or rural areas [13].

Since the 1990s, as more agent-based simulation packages and platforms such as Swarm, NetLogo, RePast and AnyLogic were released [14], agent-based modelling became increasingly used in biology and epidemiology [15].

In this section, we present the operational scheme underlying the proposed simulation. The interested reader is also referred to our NetLogo [16] codebase at the URL: https://github.com/kelvin1338/NetLogo.

The model used in this paper consists of N=10000 individuals inside a 100×100 patch grid which models a set of C=16 cities as grid blocks. Increasing the population size increases the model runtime, so we chose a population size of 10000: it is sufficiently large to provide good statistical robustness, as the variances of the measured effects at the output of the model are fairly small across different simulation runs. As shown in Fig. 1, at a parametric level, the simulation can be divided into three types of parameters: control variables, simulation structure parameters and target variables. The simulation control variables constitute the set of independent variables, such as mask usage, social distancing, lockdown delay and case isolation/quarantine, that can be changed to study their impact on target variables such as peak infection rate and the number of hospitalizations and deaths. Simulation structure parameters control how the simulation is run and include factors such as the maximum number of time steps, mask effectiveness, symptomatic death rate and healthcare system capacity. These variable settings determine the states of different agents in the simulation, thus controlling their behaviour.

Fig. 1. Concept diagram of the agent-based modelling framework, based on 3 types of parameters: Control Parameters, Simulation Structure Parameters and Target Variables.

Initially, all individuals in the simulation are healthy except for the preset proportion of asymptomatic infected individuals. As these individuals move around, virus transmission to uninfected individuals occurs. Over time, the severity level of infected individuals can increase in a probabilistic manner which can lead to hospitalization or death. Individuals with serious symptoms can die based on a probability and the remaining symptomatic cases can recover.

Below, we discuss different aspects of the agent-based simulation.

2.1.1. States of individuals

Each individual in the simulation model has an associated set of state variables: health (healthy, asymptomatic, light-symptomatic, serious-symptomatic, recovered and dead), mask usage (yes or no), quarantine status (upon developing symptoms), and hospitalization status (dependent on healthcare load and availability). The behaviour of the individual agent changes based on their state, which changes over time.
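The state variables listed above can be pictured as a small record per agent. The following is a minimal sketch in Python rather than the NetLogo code used for the study; the class and field names (`Health`, `Individual`, `wears_mask`, and so on) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Health(Enum):
    """The six health states named in Section 2.1.1."""
    HEALTHY = "healthy"
    ASYMPTOMATIC = "asymptomatic"
    LIGHT_SYMPTOMATIC = "light-symptomatic"
    SERIOUS_SYMPTOMATIC = "serious-symptomatic"
    RECOVERED = "recovered"
    DEAD = "dead"


# States in which an agent currently carries the virus.
INFECTED_STATES = {
    Health.ASYMPTOMATIC,
    Health.LIGHT_SYMPTOMATIC,
    Health.SERIOUS_SYMPTOMATIC,
}


@dataclass
class Individual:
    """Hypothetical per-agent record; field names are illustrative only."""
    age_group: str
    health: Health = Health.HEALTHY
    wears_mask: bool = False
    quarantined: bool = False
    hospitalized: bool = False

    def is_infected(self) -> bool:
        return self.health in INFECTED_STATES
```

Keeping the health state as a single enumerated value, rather than several booleans, makes it impossible for an agent to be in two health states at once.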

2.1.2. Virus transmission and effect modelling

The simulation models the average infection duration, maximum infectious distance, age-wise disease severity and fatality rates, as well as the average number of days before asymptomatic individuals begin to be symptomatic. These factors affect how infections are transmitted in the population and can be changed by the user. In this analysis, however, they are set from published sources, as presented in Table 1, and are kept fixed.

Table 1.

Control variables.

Parameter | Value | Source/Justification
Population size | 10000 | Relatively large sample to reduce anomalies
Population distribution | 1000 people per age group | An additional 1000 people has been allocated to the middle 40–49 age group so that the total population is 10000 for the ease of comparison and interpretation.
Size of simulation space | 100×100 patches | Not too spacious or densely populated and kept constant for the ease of comparison. Adjusted by setting max-pxcor and max-pycor to 100.
Metres per patch | 40 m | Arbitrary value for the ease of comparison
Total infection duration | 21 days | Median infection duration for COVID-19 is 20.8 days (Bi et al. [18])
Asymptomatic period | 6 days | Average asymptomatic period is 6 days (World Health Organisation [19])
Maximum infectious distance | 2 m | Rough estimation based on advice from WHO and a study by Setti et al. [20]
Infectivity | 100 | Infectivity should always be arbitrarily set to 100 for the ease of comparison
Weather conditions | Cold + Dry | Arbitrary value for ease of comparison
Mask usage*, Lockdown delay*, % Ignore lockdown, Social distancing*, Healthcare | 0 | All safety measures are disabled.
Enable lockdown*, City confinement | False | All safety measures are disabled.

*Only enabled if the parameter of concern is being investigated by the sensitivity analysis experiment

2.1.3. Healthcare

The simulation models a healthcare system where hospitalized individuals have reduced death rates and decreased recovery times. Individuals with severe symptoms are hospitalized based on a uniform probability distribution if the healthcare system has capacity.

2.1.4. Modelling movement and lockdown

To model the movement of individuals, all living individuals who have no symptoms can move randomly up to 0.2 units of distance per tick. The user can initiate a lockdown at any time during the simulation, during which everyone stops moving except for a proportion of non-compliant individuals, which can also be specified by the user. A city-level lockdown can also be implemented, which confines individuals to moving only inside their own city.
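The movement rule described above can be sketched as follows. This is an illustration in Python, not the NetLogo implementation: the function names are hypothetical, the uniformly random heading and step length are assumptions, and wrapping at the edges of the 100×100 world is also assumed. City confinement is not modelled here.

```python
import math
import random

WORLD_SIZE = 100.0   # patches per side (Table 1)
MAX_STEP = 0.2       # maximum distance moved per tick (Section 2.1.4)


def may_move(alive, symptomatic, lockdown_active, ignores_lockdown):
    """Return True if an individual is allowed to move this tick."""
    if not alive or symptomatic:
        return False
    # During a lockdown only the non-compliant proportion keeps moving.
    return (not lockdown_active) or ignores_lockdown


def random_step(x, y, rng):
    """Move up to MAX_STEP in a random direction (assumed: wrap at the edges)."""
    heading = rng.uniform(0.0, 2.0 * math.pi)
    distance = rng.uniform(0.0, MAX_STEP)
    new_x = (x + distance * math.cos(heading)) % WORLD_SIZE
    new_y = (y + distance * math.sin(heading)) % WORLD_SIZE
    return new_x, new_y
```

Passing the random generator in explicitly (`rng`) keeps individual runs reproducible, which matters when the same configuration is repeated several times to average out stochasticity.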

2.1.5. Modelling mask usage

A clinical experiment conducted by MacIntyre et al. over 1607 people showed that particle penetration rate was 97% through a cloth mask and 44% through medical masks [17]. The experiments in this paper assume the use of surgical masks and hence the mask penetration rate used in the model is 44%. However, this can be adjusted by the user, as can the number of mask users. For mask users, both the probability of getting infected and the probability of infecting others are reduced to (mask penetration rate)% of their unmasked values, as the remaining virus particles are ‘filtered’ by the mask.

If an infected mask user meets an uninfected mask user, then the virus particles are modelled as having to travel through two layers of masks: one from the infected person, and one from the uninfected person. Hence, in this particular situation, the probability of infecting others is reduced to $(\text{mask penetration rate})^2$ of its unmasked value; for surgical masks this is $0.44^2 \approx 0.19$.
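The one-mask and two-mask cases can be expressed as a single multiplier on the baseline infection probability. The sketch below is illustrative only; the function name `transmission_multiplier` is hypothetical, and the default of 0.44 is the surgical-mask penetration rate quoted above.

```python
def transmission_multiplier(source_masked, target_masked, penetration=0.44):
    """Factor applied to the baseline infection probability.

    Each mask between the infected and the uninfected person lets through
    a fraction `penetration` of the particles, so the factors multiply.
    """
    if not 0.0 <= penetration <= 1.0:
        raise ValueError("penetration must be a fraction between 0 and 1")
    layers = int(source_masked) + int(target_masked)
    return penetration ** layers
```

With the default value, one mask scales the infection probability by 0.44 and two masks by roughly 0.19; with no masks the multiplier is 1 and the baseline probability is unchanged.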

2.2. Simulation parameters

A breakdown of the different model parameters is provided in Table 1.

2.2.1. Serious symptomatic and death rates

The rate of an infected individual developing serious symptoms was obtained by adjusting the hospitalization rates from a study by Verity et al. [21] to match expected rates in the highest age group. This is shown in the second column of Table 2.

Table 2.

Symptomatic and death rates.

Age group | Serious symptoms rate (%) [SR] [21,22] | Real infection fatality ratio (%) [IFR] [21,22] | Model fatality rate (%) [100×IFR/SR]
0–9 | 0.1 | 0.002 | 2.00
10–19 | 0.3 | 0.006 | 2.00
20–29 | 1.2 | 0.03 | 2.50
30–39 | 3.2 | 0.08 | 2.50
40–49 | 4.9 | 0.15 | 3.06
50–59 | 10.2 | 0.60 | 5.88
60–69 | 16.6 | 2.2 | 13.25
70–79 | 24.3 | 5.1 | 20.99
80+ | 27.3 | 9.3 | 34.07

As the simulation model assumes that only individuals with serious symptoms can die, we calculated the adjusted fatality rate for each age group in our simulation based on real-life infection fatality ratio (IFR) in the 3rd column of Table 2, which was also obtained from the study by Verity et al. [21]. The aforementioned adjusted fatality rate is presented in the 4th column of Table 2, and is simply the third column divided by the second column, multiplied by 100.
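The adjustment described above is a single division per age group. As a short check, the following Python sketch recomputes the fourth column of Table 2 from the second and third columns; the dictionary and function names are illustrative only.

```python
# Serious symptoms rate (%) and real infection fatality ratio (%) from Table 2.
SERIOUS_RATE = {
    "0-9": 0.1, "10-19": 0.3, "20-29": 1.2, "30-39": 3.2, "40-49": 4.9,
    "50-59": 10.2, "60-69": 16.6, "70-79": 24.3, "80+": 27.3,
}
IFR = {
    "0-9": 0.002, "10-19": 0.006, "20-29": 0.03, "30-39": 0.08, "40-49": 0.15,
    "50-59": 0.60, "60-69": 2.2, "70-79": 5.1, "80+": 9.3,
}


def model_fatality_rate(ifr_percent, serious_percent):
    """Death rate (%) among serious cases that reproduces the overall IFR."""
    return 100.0 * ifr_percent / serious_percent


MODEL_FATALITY = {
    age: round(model_fatality_rate(IFR[age], SERIOUS_RATE[age]), 2)
    for age in SERIOUS_RATE
}
```

The division is needed because only agents with serious symptoms can die in the model: multiplying the serious symptoms rate by the model fatality rate recovers the real-world IFR for each age group.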

2.2.2. Ranges of analysis variables

Recall that in this research paper, the four main independent variables of interest are mask usage, social distancing, lockdown and isolation of symptomatic cases.

For mask usage and isolation rate, the minimum and maximum values are the cases where nobody follows the rule, and everybody follows the rule, respectively. Hence, 0 is the minimum, and 100 is the maximum.

For social distancing, the minimum value is the case where nobody follows social distancing rules, which is 0 m. The maximum value was chosen to be 2.5 m because, realistically, most countries enforce social distancing of up to 2 m, and the maximum radius of infection is estimated to be approximately 2 m.

For lockdown delay, the minimum bound was chosen to be 7 days because it would be unrealistic for authorities to enforce a lockdown immediately from the moment the first person in the country catches the infection. The maximum bound was chosen to be 32 days because from the visual analysis and univariate analysis, delaying the lockdown beyond 20–24 days has a negligible effect on the peak of the infection curve.

2.2.3. Target variables

From each simulation run, the number of infections is monitored. As the simulation involves stochasticity, the peak percentage of infections across multiple simulation runs for a given parameter setting is used as a simple and interpretable variable of interest.

For each configuration of parameters, a more in-depth analysis can be provided, allowing the % of the population infected at any given time to be recorded, as well as other more specific target variables such as the asymptomatic, light-symptomatic, serious-symptomatic, recovered and dead populations at any point in time.

2.3. Sensitivity analysis

We are interested in studying the impact of each of the independent variables on the target variables whilst the virus-related simulation parameters are kept fixed. For this purpose, we use univariate analysis and the method of elementary effects as discussed below.

2.3.1. Univariate analysis

In univariate sensitivity analysis, a particular parameter is adjusted, whilst the remaining parameters remain constant. The peak % of daily infections corresponding to the maximum value in the infection curve is recorded for each run. This was conducted for each of the four independent variables in increments of 20% from the absolute minimum to the absolute maximum. To ensure statistical robustness in our results, the simulation is run 12 times for each increment of the independent variable. The results are then analysed and plotted on a scatter graph.

The sensitivity index of each parameter was calculated from its median peak infections using the formula below and compared with those of the other parameters.

$$\text{sensitivity} = \frac{Y_{\max} - Y_{\min}}{Y_{\max}} \tag{1}$$

Here, $Y_{\max}$ and $Y_{\min}$ correspond to the maximum and minimum total infections respectively, for each set of parameter settings.

Sensitivity index was chosen because it is an easily interpretable metric that quantifies the impact of an NPI on infections across all simulation runs. As each set of parameter configurations is run 12 times and 5 increments are analysed for each of the four NPIs, there are 60 simulation runs for each NPI and 240 simulation runs in total.
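Equation (1) can be checked directly against the median peak values reported in Table 6 (Section 3.1). The sketch below is illustrative; the function name is hypothetical and the three input lists are copied from the 'Median' row of that table.

```python
def sensitivity_index(values):
    """Equation (1): (Y_max - Y_min) / Y_max over a parameter's sweep."""
    y_max, y_min = max(values), min(values)
    if y_max == 0:
        raise ValueError("sensitivity index is undefined when Y_max is 0")
    return (y_max - y_min) / y_max


# Median peak % of active cases per setting, taken from Table 6.
social_distancing = [45.38, 26.94, 26.09, 18.38, 12.75, 9.79]
lockdown_delay = [14.45, 25.09, 32.54, 40.53, 42.12, 45.52]
case_isolation = [44.13, 40.53, 41.34, 39.29, 36.76, 33.64]

print(round(sensitivity_index(social_distancing), 3))  # 0.784
print(round(sensitivity_index(lockdown_delay), 3))     # 0.683
print(round(sensitivity_index(case_isolation), 3))     # 0.238
```

An index close to 1 means the measure can remove almost all of the peak across its range, whereas an index close to 0 means the peak barely responds to it.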

An experiment is run for each of the four independent parameters according to the range specified in Table 3.

Table 3.

Boundaries of independent variables.


Parameter | Social distancing (metres) | Mask usage rate | Lockdown delay | Symptomatic isolation rate
Maximum | 2.5 | 100 | 32 | 100
Minimum | 0 | 0 | 7 | 0

2.3.2. Elementary effects method

Whilst the previous univariate sensitivity analysis can provide insights regarding the strength and effect of each individual variable, it fails to capture multivariate correlation. The following section provides a multivariate analysis of the four main preventative measures variables - mask usage, social distancing, symptomatic isolation and lockdown delay, and how they interact with each other via Morris Elementary Effects Method (EEM) [23], which is an adaptation of the One-At-a-Time (OAT) design approach that identifies the most important variables in determining the output using a small number of simulations.

Given a process $Y = g(x_1, \ldots, x_k)$ dependent upon $k$ factors with $x_i \in [0, 1]$, EEM generates $r$ trajectories through the input space, which is divided into a $k$-dimensional grid consisting of $\rho$ levels of equal size. A trajectory begins with randomly selected base values $x^* = [x_1^*, \ldots, x_k^*]$ in the $\rho$-level grid. This base vector is used to generate $k$ different parameter vectors for a particular trajectory. For this study the EEM trajectories were generated by the SALib library [24]. The first parameter vector $x^{(1)}$ is generated by adding $\Delta$ to, or subtracting $\Delta$ from, one of the parameters. Similarly, $x^{(2)}$ is obtained by adding $\Delta$ to, or subtracting $\Delta$ from, another parameter. The parameters defined here are used in equations (2)–(6) below. After obtaining the values for each parameter vector in the trajectory, the elementary effect of a parameter can then be obtained using the formula below:

$$EE_i = \frac{Y(x_1, \ldots, x_i + \Delta, \ldots, x_k) - Y(x_1, \ldots, x_i, \ldots, x_k)}{\Delta} \tag{2}$$

After the elementary effects were obtained, the corresponding mean, absolute mean and variance of the elementary effects of each variable were calculated using the three formulas below, respectively:

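$$\mu_i = \frac{1}{r} \sum_{j=1}^{r} EE_{ij} \tag{3}$$

$$\mu_i^* = \frac{1}{r} \sum_{j=1}^{r} \left| EE_{ij} \right| \tag{4}$$

$$\sigma_i = \frac{1}{r-1} \sum_{j=1}^{r} \left( EE_{ij} - \mu_i \right)^2 \tag{5}$$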

The reason $\mu_i^*$ is introduced and uses absolute values is to prevent certain elementary effects from cancelling out due to opposite signs. Note that a high value of $\mu_i^*$ implies that the output is very sensitive to the value of parameter $i$. A large value of $\sigma_i$ indicates that there is an interaction between this parameter and other parameters and that the parameter is interconnected to the values of other parameters.

After calculating the corresponding values for a particular variable, its ‘rank’ can be calculated using the following formula, which determines the strength of the parameter's effect relative to other parameters:

$$\text{Rank}_i = \sqrt{\mu_i^{*2} + \sigma_i} \tag{6}$$
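Equations (2)–(6) can be sketched in a few lines of plain Python. This is an illustration rather than the SALib code used in the study: the function names are hypothetical, and $\sigma_i$ is computed as a sample variance, following the text's description of it as the variance of the elementary effects. The rank formula, with the square root taken over the whole sum, agrees with the values reported in Table 7 to within rounding.

```python
import math


def elementary_effect(y_stepped, y_base, delta):
    """Equation (2): change in output per unit change in one parameter."""
    return (y_stepped - y_base) / delta


def morris_statistics(effects):
    """Equations (3)-(5) for one parameter over r trajectories."""
    r = len(effects)
    if r < 2:
        raise ValueError("at least two elementary effects are required")
    mu = sum(effects) / r
    mu_star = sum(abs(e) for e in effects) / r
    # Assumption: sigma is the sample variance, as described in the text.
    sigma = sum((e - mu) ** 2 for e in effects) / (r - 1)
    return mu, mu_star, sigma


def rank(mu_star, sigma):
    """Equation (6): combined strength of a parameter's effect."""
    return math.sqrt(mu_star ** 2 + sigma)
```

For example, `rank(4.275, 5.255)` gives approximately 4.85, in line with the rank reported for social distancing.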

Finally, the results of the elementary effects analysis are compared to the real-life trends of COVID-19 in the United Kingdom, Hong Kong and Italy, which were chosen because of their contrasting ways of dealing with COVID-19. Population and epidemic-response parameters for these countries (see Table 4) have been gathered through background research based on various medical and news sources, as well as interviews and surveys.

Table 4.

Summarised background research of UK, Hong Kong and Italy.

Attribute | United Kingdom | Hong Kong | Italy
Date of first case | 23rd January [25] | 22nd January [26] | 31st January [27]
Median age | 40.5 [28] | 44.8 [29] | 47.3 [30]
Proportion of people aged 65+ | 18.48% [28] | 18.48% [29] | 22.08% [30]
Population | 67.78 million [31] | 7.48 million [31] | 60.48 million [31]
Population density (people/km²) | 281 [31] | 7140 [31] | 206 [31]
Urban population | 83% [31] | 100% [31] | 69.5% [31]
Rural population | 17% [31] | 0% [31] | 30.5% [31]
Previous experience with similar outbreaks? | No | Yes (SARS 2003) [32] | No
% reported to follow lockdown rules | 89% [33] | N/A | N/A
Lockdown implemented? | Yes [34] | No | Yes [35]
14-day quarantine implemented? | Yes (Not properly enforced) | Yes (Strict) [36] | No
Usage of masks | Rare | Strictly followed from the start | Mandatory after some time
Main source of healthcare | NHS [37] | Public + Private healthcare | SSN [38]
Healthcare free for citizens? | Yes [37] | No [39] | Yes [38]
% of citizens who can afford healthcare | 100% [37] | 92% [39] | 100% [38]
Total doctors | 280000 [40] | 14290 [41] | 427213 [42]
Doctors per 1000 people | 2.8 [40] | 1.9 [41] | 4 [43]
Total nurses | 661000 [44] | 56723 [45] | 418461 [46]
Total nurses per 1000 people | 8.17 [44] | 7.3 [45] | 5.74 [46]
Hospital beds per 1000 people | 6.6 [47] | 7.1 [48] | 3.4 [46]

2.3.2.1. Implementation of EEM

The values for each of the independent variables were standardised to a range between 0 and 1. The value of Δ was chosen to be 0.2, which meant that each parameter can take 6 unique standardised values: 0, 0.2, 0.4, 0.6, 0.8 or 1.0. As there are four parameters, it follows that there are $6^4 = 1296$ possible permutations. Using BehaviorSpace, 1296 simulations were run, once for each permutation. The peak infection % was recorded for each run.
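The standardised grid and its mapping back to model units can be sketched as below. This is illustrative Python rather than the BehaviorSpace configuration used in the study; the dictionary keys and function name are hypothetical, and the bounds are those of Table 3. A linear mapping between the bounds is assumed.

```python
from itertools import product

# (minimum, maximum) for each independent variable, from Table 3.
BOUNDS = {
    "social_distancing_m": (0.0, 2.5),
    "mask_usage_pct": (0.0, 100.0),
    "lockdown_delay_days": (7.0, 32.0),
    "isolation_pct": (0.0, 100.0),
}

# Six standardised levels: 0, 0.2, 0.4, 0.6, 0.8, 1.0.
LEVELS = [i / 5 for i in range(6)]


def to_model_units(point):
    """Map a standardised point in [0, 1]^4 linearly onto the Table 3 ranges."""
    return {
        name: low + level * (high - low)
        for (name, (low, high)), level in zip(BOUNDS.items(), point)
    }


# Every combination of levels across the four parameters.
GRID = list(product(LEVELS, repeat=len(BOUNDS)))
print(len(GRID))  # 1296
```

Under this mapping a standardised step of 0.2 corresponds to 0.5 m of social distancing, 20 percentage points of mask usage or isolation, and 5 days of lockdown delay, matching the settings listed in Table 6.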

Thirty trajectories were generated with SALib, a Python library that contains a Morris-Elementary-Effect toolkit. By the definition of EEM, each trajectory comprises 5 simulations (the base point plus one step for each of the four parameters), meaning 5×30=150 different simulations were required. Recall that the peak infection % for all 1296 permutations is available. Hence, for each point in each trajectory, the peak % of infections was extracted from the list of 1296 runs and mapped accordingly.
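The structure of a trajectory, and the lookup of precomputed results, can be sketched as follows. This is a simplified one-at-a-time sampler written for illustration and is not SALib's sampling routine; the function names and the `peak_by_point` dictionary (mapping a grid point to its recorded peak infection %) are hypothetical. Points are stored as integer level indices 0 to 5, standing for the standardised values 0 to 1.0 in steps of Δ=0.2.

```python
import random

K = 4          # number of parameters
N_LEVELS = 6   # level indices 0..5 stand for 0, 0.2, ..., 1.0


def random_trajectory(rng, k=K, n_levels=N_LEVELS):
    """Return k + 1 grid points; each step moves one parameter by one level."""
    point = [rng.randrange(n_levels) for _ in range(k)]
    trajectory = [tuple(point)]
    # Change every parameter exactly once, in a random order.
    for i in rng.sample(range(k), k):
        if point[i] == 0:
            step = 1                      # cannot step below the grid
        elif point[i] == n_levels - 1:
            step = -1                     # cannot step above the grid
        else:
            step = rng.choice((-1, 1))
        point[i] += step
        trajectory.append(tuple(point))
    return trajectory


def lookup_outputs(trajectory, peak_by_point):
    """Fetch the precomputed peak infection % for each point of a trajectory."""
    return [peak_by_point[p] for p in trajectory]


rng = random.Random(0)
trajectories = [random_trajectory(rng) for _ in range(30)]
print(sum(len(t) for t in trajectories))  # 150
```

Because every trajectory point lies on the same six-level grid as the 1296 precomputed runs, no additional simulations are needed once the full grid has been evaluated.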

Now that there is a list of trajectories with the corresponding outcomes, the final step was to calculate μ, μ*, σ and the rank for each parameter. Using the ‘analyze’ function in SALib, μ, μ* and σ were computed, and the rank was manually calculated afterwards. The results are presented later in section 3.2.

2.4. Implementation

The interactive agent-based simulation model has been uploaded to GitHub along with detailed usage instructions, which can be found on the following link: https://github.com/kelvin1338/NetLogo.

3. Results

The following section provides a breakdown of the results of all experiments proposed in section 2.

3.1. Univariate analysis results

The sensitivity indices obtained from the simulation for each of the four independent variables are given in Table 5. It can be observed that social distancing has a very high sensitivity index, whilst the mask usage rate and lockdown delay both have relatively high and similar sensitivity indices. Isolating symptomatic cases results in the lowest sensitivity index. It was also observed that a lockdown reduces the infection numbers to 0 in the shortest amount of time, whilst social distancing and mask usage are relatively slower. Both of these measures flatten the curve as the respective measure is followed by more people, although it takes longer for the cases to reach 0 again. Purely isolating symptomatic cases reduces the peak, although there is little impact on the distribution of the infection curve. Hence, this has the least significant effect out of the four safety measures.

Table 5.

Sensitivity index of the four independent variables.

Parameter (i) | Social distancing (metres) | Mask usage rate | Lockdown delay | Symptomatic isolation rate
Sensitivity index | 0.784 | 0.692 | 0.683 | 0.238

3.1.1. Social distancing

Social distancing has a sensitivity index of 0.784, based on the results presented in Table 6, which is the highest out of the four parameters. From Fig. 2a, it can be seen that there is a significant decrease in the severity of infections as soon as social distancing is enforced. As the distance threshold of social distancing is increased, the severity of infections decreases, following a negative logarithmic trend. Fig. 2b shows the infection curve corresponding to the number of active cases at a given time. As the social distancing distance increases, both the peak number of active cases and the rate at which they rise decrease, resulting in a flattening of the infection curve [49].

Table 6.

Summary statistics for social distancing, mask usage, lockdown delay and symptomatic case isolation, measuring peak % of active cases.


Social distancing (metres) | 0 | 0.5 | 1 | 1.5 | 2 | 2.5
Median | 45.38 | 26.94 | 26.09 | 18.38 | 12.75 | 9.79
Mean | 44.27 | 26.91 | 25.69 | 18.53 | 13.08 | 10.06
Range | 6.69 | 8.03 | 5.30 | 3.54 | 4.45 | 3.22
Variance | 4.98 | 4.49 | 3.10 | 2.32 | 1.78 | 1.12
Standard deviation | 2.23 | 2.12 | 1.76 | 1.52 | 1.33 | 1.06
Standard error | 0.19 | 0.18 | 0.15 | 0.13 | 0.11 | 0.09

Mask usage rate (%) | 0 | 20 | 40 | 60 | 80 | 100
Median | 45.59 | 33.10 | 25.87 | 19.20 | 14.06 | 9.68
Mean | 44.63 | 33.87 | 25.61 | 18.95 | 14.09 | 9.85
Range | 9.38 | 6.72 | 7.18 | 3.77 | 3.79 | 3.35
Variance | 8.16 | 5.52 | 3.36 | 1.69 | 1.38 | 0.97
Standard deviation | 2.86 | 2.35 | 1.83 | 1.30 | 1.18 | 0.98
Standard error | 0.24 | 0.20 | 0.15 | 0.11 | 0.10 | 0.08

Lockdown delay (days) | 7 | 12 | 17 | 22 | 27 | 32
Median | 14.45 | 25.09 | 32.54 | 40.53 | 42.12 | 45.52
Mean | 14.29 | 25.03 | 32.33 | 40.90 | 41.97 | 44.63
Range | 4.51 | 4.94 | 8.29 | 8.76 | 8.62 | 11.30
Variance | 1.52 | 2.30 | 5.85 | 6.65 | 4.43 | 11.62
Standard deviation | 1.23 | 1.52 | 2.42 | 2.58 | 2.10 | 3.41
Standard error | 0.10 | 0.13 | 0.20 | 0.21 | 0.18 | 0.28

Symptomatic case isolation (%) | 0 | 20 | 40 | 60 | 80 | 100
Median | 44.13 | 40.53 | 41.34 | 39.29 | 36.76 | 33.64
Mean | 43.19 | 41.39 | 41.76 | 38.96 | 37.48 | 33.77
Range | 8.63 | 12.40 | 13.18 | 11.19 | 14.12 | 9.13
Variance | 8.16 | 11.75 | 12.34 | 11.91 | 16.16 | 8.44
Standard deviation | 2.86 | 3.43 | 3.51 | 3.45 | 4.02 | 2.91
Standard error | 0.24 | 0.29 | 0.29 | 0.29 | 0.33 | 0.24

Fig. 2. Univariate analysis results of social distancing and mask usage.

3.1.2. Mask usage rate

Mask usage has a sensitivity index of 0.692, based on the results presented in Table 5, which is almost identical to lockdown delay but less than social distancing. From Fig. 2c, it can be seen that as the mask usage rate increases from 0 to 100, the severity of infections decreases with a negative-logarithmic trend, although it is more linear compared to social distancing. Fig. 2d shows that as mask usage increases, the peak of the curve flattens whilst taking longer to reach 0 cases again for all configurations.

3.1.3. Lockdown delay

Lockdown delay has a sensitivity index of 0.683, based on the results from Table 6, which indicates that it has almost as much effect as mask usage but less than social distancing. From Fig. 3a, the severity of infections increases almost linearly and very steeply as the lockdown delay is increased until a certain point, which highlights the importance of enforcing a lockdown early, before the cases get out of control. In addition, it can be seen from Fig. 3b that enforcing a lockdown after the peak infection has been reached has very little effect on the severity of infections.

Fig. 3. Univariate analysis results of lockdown delay and symptomatic case isolation.

3.1.4. Symptomatic case isolation rate

Symptomatic case isolation has a sensitivity index of 0.238, based on the results from Table 6, which is the lowest of the four parameters, meaning it has the least effect out of the four safety measures. From Fig. 3c, it was observed that using this parameter alone will result in a high variance for all settings compared to the other three preventative measures, suggesting isolating symptomatic cases alone will not consistently contain the virus. Fig. 3d shows that isolating cases alone reduces the peak of the infection curve in a linear manner. However, it is relatively less effective in flattening the curve as it takes a long time to reach 0 cases again.

3.2. Elementary effects analysis results

The results of the elementary effects analysis are shown in Table 7. From the results in this table, social distancing is the most effective measure in reducing the peak of the infection curve, followed by mask usage, lockdown delay and finally isolation of symptomatic cases. Social distancing has the highest value of $\mu_i^*$, meaning the peak of the infection curve is very sensitive to the measure. It also has the highest value of $\sigma_i$, meaning that there is a large interaction between this parameter and the other three parameters.

Table 7.

Results of elementary effects for 30 trajectories, p = 6 and Δ = 0.2.

Parameter (i) | Social distancing (metres) | Mask usage rate | Lockdown delay | Symptomatic isolation rate
$\mu_i^*$ | 4.275 | 4.039 | 2.014 | 1.377
$\mu_i$ | −3.981 | −4.039 | 1.641 | −0.777
$\sigma_i$ | 5.255 | 3.246 | 2.881 | 1.667
$\text{Rank}_i$ | 4.850 | 4.422 | 2.634 | 1.888

As seen from the results table, enforcing a lockdown has a relatively lower σ and μi* compared to masks and social distancing, indicating that it has less effect on the peak of the infection curve than masks and social distancing regardless of parameter interactions. Whilst the univariate analysis earlier showed that a lockdown is the most effective measure at reducing the peak of the infection curve as quickly as possible, this experiment showed that a lockdown may not be necessary in the first place, as there would potentially be no significant infection peak initially if social distancing is followed and masks are used by everyone.

3.3. Relationship of EEM results with trends in different countries

In order to understand the correspondence of simulation results with real-world data, we have studied the trends of viral transmission across different countries. Specifically, we obtained the COVID-19 timelines, the main preventative measures and infection numbers over time for the United Kingdom, Hong Kong and Italy. We then compared these trends and numbers to our Elementary Effects results in Table 7.

The experiment results show the importance of surgical mask usage and strongly support Hong Kong's strategy in successfully containing the virus, which involved a strong emphasis on mask usage and social distancing from the very beginning [50] whilst not enforcing any total lockdowns. For example, some restaurants, entertainment venues and sports centres were closed and the majority of the population began wearing masks before the virus even reached Hong Kong.

The results from the EEM show that the symptomatic isolation rate is the least effective safety measure out of the four variables, which can be explained by COVID-19's long asymptomatic period. This finding is consistent with the experience of the United Kingdom and Italy, which were heavily focused on the isolation of symptomatic cases and were much slower than Hong Kong to enforce any other safety measures [51]. For example, it took the United Kingdom longer than a month before restaurants, pubs, schools and entertainment venues were shut, and several more months before people started wearing masks. This led to the infection curve rising to drastically higher values compared to Hong Kong, as seen in Fig. 4, which compares the percentage of the overall population infected daily between the three countries. It can be seen that the cases in the UK and Italy were far higher than in Hong Kong, and they were left with no choice but to implement a lockdown, which is consistent with the EEM and univariate analysis results.

Fig. 4. Daily COVID-19 cases in the United Kingdom, Hong Kong and Italy [52].

3.4. Comparison of model infection trends with historical trends

We conducted a further investigation to test the validity of our model by comparing the trend of the infection numbers to real-world data. For this simulation, Hong Kong, Italy and the United Kingdom have been chosen due to their unique infection trends. The model simulation structure parameters were adjusted accordingly to resemble the population of the simulated country, whilst the control parameters are adjusted during the simulation, at different points, to loosely follow the historical timeline and actions taken by the countries. A simulation was run for each country up to the end of the first major wave of infections and the results are shown in the following subsections.

Note that in real life, many asymptomatic cases are not recorded, and hence the ‘Symptomatic’ blue curves in Figs. 5, 7 and 9 follow the reported historical figures most closely, as symptomatic cases are the most likely to be detected and reported in real life. The other curves provide a more detailed analysis and help us estimate the true number of cases.

Fig. 5. Model results for Hong Kong.

Fig. 7. Model results for Italy.

Fig. 9. Model results for the UK.

In the following sections, we discuss the results for Hong Kong, Italy and the United Kingdom. For each country, the correlation between the normalised time series of the predicted and actual numbers of cases is calculated, together with the corresponding p-value. Note that as the model results were measured hourly and the historical results were measured daily, the model results were randomly sampled so that they are the same length as the historical data. This allows us to compute correlation coefficients.
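The comparison procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, min-max scaling is assumed for the normalisation (the text does not specify the method), the random subsample is kept in time order, and the data are toy values rather than model output. P-values are omitted because they require a statistics library.

```python
import random


def min_max_normalise(series):
    """Rescale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def subsample_in_order(series, target_length, rng):
    """Randomly keep target_length points while preserving time order."""
    keep = sorted(rng.sample(range(len(series)), target_length))
    return [series[i] for i in keep]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


rng = random.Random(0)
# Toy data, not the paper's results: 10 "days" of hourly model output.
hourly_model = [float(h * h) for h in range(240)]
daily_reported = [float(d) for d in range(10)]

sampled = subsample_in_order(hourly_model, len(daily_reported), rng)
x = min_max_normalise(sampled)
y = min_max_normalise(daily_reported)
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Both coefficients are unchanged by a positive linear rescaling of either series, so the choice of normalisation affects how the curves look when plotted together rather than the correlation values themselves.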

3.4.1. Hong Kong

The age distribution used for the simulation of Hong Kong is provided in Table 8, calculated from data provided by the United Nations [53].

Table 8.

Age structure of Hong Kong's population [53].

Age Group Proportion of Population
0–9 0.0859
10–19 0.0746
20–29 0.1203
30–39 0.1512
40–49 0.1514
50–59 0.1648
60–69 0.1358
70–79 0.0658
80+ 0.0501
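As an illustration of how an age structure such as the one in Table 8 could be used to initialise simulated agents, the following sketch draws age bands in proportion to the tabulated values. It is a hypothetical example and not the authors' implementation: the names `AGE_BANDS`, `HONG_KONG_PROPORTIONS` and `sample_age_bands` are assumptions, as is the population size of 10,000.

```python
import random

# Age bands and proportions copied from Table 8 (Hong Kong).
# The values are fractions of the population and sum to 0.9999 due to rounding.
AGE_BANDS = ["0-9", "10-19", "20-29", "30-39", "40-49",
             "50-59", "60-69", "70-79", "80+"]
HONG_KONG_PROPORTIONS = [0.0859, 0.0746, 0.1203, 0.1512, 0.1514,
                         0.1648, 0.1358, 0.0658, 0.0501]


def sample_age_bands(proportions, n_agents, rng):
    """Draw one age band per agent, weighted by the population proportions."""
    # random.choices normalises the weights, so they need not sum to exactly 1.
    return rng.choices(AGE_BANDS, weights=proportions, k=n_agents)


rng = random.Random(42)
agents = sample_age_bands(HONG_KONG_PROPORTIONS, 10_000, rng)
share_50_59 = agents.count("50-59") / len(agents)
print(f"50-59 share in sample: {share_50_59:.3f}")
```

With a population of this size, the sampled share of each band should sit close to the tabulated proportion, with small deviations from sampling noise.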

As the first two waves of infections in Hong Kong contained very few infections, the simulation is run until the end of the first major spike of infections, known as the ‘third wave’, so that sufficient data can be compared.

3.4.1.1. Validation setup

Having experienced the SARS outbreak in 2003, Hong Kong took a more cautious approach to COVID-19 than the UK and Italy from the beginning [54]. Hence, this model assumes 1.5 m of social distancing and 99% mask usage from day 0.

The model is set up to simulate Hong Kong from the middle of March, just as the first spike in cases was about to begin. Hong Kong imposed strict border restrictions very soon after the initial case [36]. Hence, this model assumes that it took just 7 days to tighten border controls. To model imported COVID-19 cases, 2 out of 10000 people are randomly infected every week from day 7.

Very soon after tightening borders, Hong Kong began enforcing a curfew, closing down many big tourist attractions and closing down the high-speed rail to China [55,56]. Hence, after another week, the lockdown parameter was enabled from day 14 with a relatively high number of intra-city travellers and essential workers, resembling a semi-lockdown.

After three months, Hong Kong began easing its border control by giving quarantine exemptions to people who met certain conditions [57], and this is modelled in the simulation by increasing the number of randomly infected people from 2 per week to 10 per week after day 120. Simultaneously, Hong Kong began gradually easing its curfew by partially reopening public facilities [58,59] and raising the 8-person rule to 50. This is modelled in the simulation by disabling the lockdown and social distancing on day 126.

After approximately three weeks, the government tightened its borders again, including banning flights from India [60], which is replicated by reducing the randomly selected weekly infections from 10 back to 2. A semi-lockdown was also enforced again, which is replicated in the model by re-enabling social distancing and the total lockdown.
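The timeline above amounts to a schedule of parameter changes keyed by simulation day. The sketch below shows one way such a schedule could be represented and queried. It is illustrative only: the parameter names are hypothetical and do not come from the model's source code, day 141 for the second tightening is taken from the validation results in Section 3.4.1.2, and the re-enabled social distancing value of 1.5 m is an assumption based on the initial setting.

```python
# Hypothetical parameter names; days and values follow the Hong Kong setup text.
HONG_KONG_SCHEDULE = {
    0: {"social_distancing_m": 1.5, "mask_usage": 0.99},
    7: {"imported_per_week": 2},        # per 10,000 people, borders tightened
    14: {"lockdown": True},             # semi-lockdown begins
    120: {"imported_per_week": 10},     # quarantine exemptions granted
    126: {"lockdown": False, "social_distancing_m": 0.0},
    # Second tightening; the 1.5 m value is an assumption (initial setting).
    141: {"imported_per_week": 2, "lockdown": True, "social_distancing_m": 1.5},
}


def active_parameters(schedule, day):
    """Merge every change scheduled on or before `day`, oldest first."""
    params = {}
    for change_day in sorted(schedule):
        if change_day > day:
            break
        params.update(schedule[change_day])
    return params


for day in (0, 14, 130, 141):
    print(day, active_parameters(HONG_KONG_SCHEDULE, day))
```

Keeping the interventions in a single table like this makes it straightforward to reproduce a run, or to swap in the Italian or UK timelines described in the later subsections.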

3.4.1.2. Validation results

Initially, the model infection numbers rose at a relatively fast pace, but the tightening of borders and the semi-lockdown were effective: infection numbers stopped rising at around day 20, then fell and were almost back to zero by approximately day 50. This resembles the first and second waves of COVID-19 in Hong Kong in the real-world data shown in Fig. 6.

Fig. 6. Actual active cases in Hong Kong [61].

From day 120, the increase in imported cases, the reduction of social distancing and the ending of the partial lockdown caused the numbers to rapidly spike to approximately twice as high as the initial spike, which also resembles the real-world data.

From day 141, the tightening of borders, increase of social distancing and a second semi-lockdown quickly dampened the number of infections again. Overall, the model produced an infection curve strongly resembling historical data in Hong Kong, as seen by the simulation plot in Fig. 5 and the real-world data plot in Fig. 6. The raw number of infections was also significantly smaller than the infections in the UK and Italy, strongly resembling the real-world trend.

A comparison between the sampled normalised model results and real-world data of Hong Kong showed that the Pearson correlation coefficient is 0.40 with a corresponding p-value of 3.65×10⁻⁹, and the Spearman's rank correlation coefficient is 0.81 with a corresponding p-value of 1.31×10⁻⁴⁸. Both correlation coefficients are fairly large and the p-values are extremely small, which is strong evidence of correlation. This suggests that the model can produce a fairly accurate replica of the pandemic in Hong Kong and successfully captured the trend. Any discrepancy is possibly due to differences between the model parameters and the real-world parameters.

3.4.2. Italy

The age distribution used for the simulation of Italy is provided in Table 9, calculated from data provided by the United Nations [53].

Table 9.

Age structure of Italy's population [53].

Age Group Proportion of Population
0–9 0.0843
10–19 0.0948
20–29 0.1013
30–39 0.1173
40–49 0.1524
50–59 0.1561
60–69 0.1221
70–79 0.0980
80+ 0.0738

3.4.2.1. Validation setup

The simulation starts from the moment Italy recorded its initial case. Italy almost immediately stopped all direct flights from China, whilst keeping all other flights open. Hence, the initial random number of infections per week is set to 25. Initially, not many people were aware of the virus and many thought it was a common cold until awareness was raised via the media [62]. Hence, the initial symptomatic isolation rate is set at 30%, and later raised to 60% on the 3rd day and 90% on the 7th day, as people became increasingly aware of the virus.

As the virus quickly got out of control, Italy began closing some cities [63], which is modelled by enabling city confinement on the 7th day. Schools also closed soon afterwards [63], which increased social distancing. This is modelled by increasing the social distancing metres to 0.5, also on the 7th day. Note that although cities are confined, essential workers can still travel between cities.

The government then began strictly enforcing 1 m of social distancing [64], which is modelled by increasing the social distancing metres to 1 on the 10th day.

Although the rate of infections had reduced, the curve was still rising, and Italy further increased its safety measures by enforcing mobility restrictions, followed by a total lockdown [63]. This is implemented in the model by increasing the social distancing metres to 1.5 on day 12, and by enabling the total lockdown and decreasing the random infections to 6 per week on day 15.

This gradually stopped the increase of cases, and the curve quickly dropped, similar to the real data. After an extended period of lockdown, the government began reopening public facilities and relaxing the lockdown rules [65], which is modelled by ending the total lockdown and reducing the social distancing threshold to 1.0 m. This caused the rate of reduction to slow down. Eventually, Italy partially reopened its borders to some neighbouring European countries [66], which is modelled in the simulation by increasing the random weekly infection to 15 per week.

3.4.2.2. Validation results

With minimal NPIs enforced, the infection numbers rose rapidly for the first week until social distancing measures and confinement were introduced. However, from Fig. 7, it can be seen that the cases were still rapidly increasing, which indicated that the preventative measures were not sufficient.

It was not until the enforcement of a full lockdown on day 15 that the infections stopped increasing and the curve quickly dropped, which resembled the historical data, as seen in Fig. 8.

Fig. 8. Actual active cases in Italy [67].

However, although the infection number was dropping, it was still far from zero when the public facilities partially reopened on day 42 of the model simulation. After the public facilities reopened, the infection numbers still decreased, but at a slower rate. After the reopening of borders, the infection curve stopped decreasing and began slowly rising again, which accurately reflected the historical trend where the infection numbers began rising towards the end of August, as seen in Fig. 8. These model results indicated that a new wave might be coming, which was correct, as a second wave soon arrived according to historical data [52]. Overall, there was a large resemblance between the model results and the historical data, as seen in Figs. 7 and 8.

A comparison between the sampled normalised model results and real-world data of Italy showed that the Pearson correlation coefficient is 0.66 with a corresponding p-value of 7.89×10⁻²⁸, and the Spearman's rank correlation coefficient is 0.73 with a corresponding p-value of 4.38×10⁻³⁶. These correlation coefficients are both relatively high and the p-values are very close to zero, which is evidence that the model can produce a reasonably accurate replica of the pandemic in Italy and successfully captured the trend.

3.4.3. United Kingdom

The age distribution used for the simulation of the UK is provided in Table 10, calculated from data provided by the United Nations [53].

Table 10.

Age structure of UK's population [53].

Age Group Proportion of Population
0–9 0.1194
10–19 0.1121
20–29 0.1278
30–39 0.1363
40–49 0.1277
50–59 0.1353
60–69 0.1067
70–79 0.0840
80+ 0.0506

3.4.3.1. Validation setup

Initially, the United Kingdom approached the pandemic with the herd immunity strategy [68], where minimal countermeasures were used except for raising awareness of the virus and encouraging people with symptoms to self-isolate at home. Hence, the experiment was set up with no social distancing as well as a mask usage rate of only 5% [69]. Initially, there was very little awareness of the virus and many people still went outside despite feeling symptoms, thinking it was a common flu. Hence, the initial symptomatic isolation rate was set to 30%. UK airports were fully open at the start of the pandemic; hence, to model imported COVID-19 cases, 40 out of 10000 people are randomly infected every week initially.

For the first 9 days, no significant changes were made in the simulation model except that the symptomatic isolation rate was increased to 60% on day 3, and then to 90% on day 7, as people became more aware of COVID-19.

Eventually, the government abandoned the herd immunity strategy and began partially closing public facilities and encouraging 2 m of social distancing. To model this in the simulation, the social distancing metres was increased to 2.

The UK then went into full lockdown and further tightened its borders, which was modelled in the simulation by enforcing a total lockdown and reducing the number of random infections from 40 per week to 6, from day 15. Approximately 2 weeks after the peak [70], the UK began partially easing its lockdown rules and more people began going outside. This is modelled by decreasing the social distancing metres to 1.5 from day 28. Border controls eventually became more lenient as neighbouring countries saw a drop in COVID-19 cases, and this is modelled by increasing the random infections from 6 to 18, from day 38. Eventually, the UK dropped the lockdown rule and reopened most of its facilities, which is modelled by decreasing the social distancing metres to 0.5 and disabling the total lockdown.

As the UK stopped releasing daily recovered cases data after April 13th, 2020 [52], the active cases could not be calculated or plotted. Hence, the validation numbers are compared to the number of real daily cases, which has a similar distribution.

3.4.3.2. Validation results

As a result of the herd immunity strategy, the number of cases rapidly increased for the first 9 days. The UK then changed its strategy, increasing social distancing and partially closing facilities. This slowed the rate of increase of infections, as visualised by the green curve in Fig. 9, followed by the blue and orange curves due to the symptomatic delay. However, it can be seen that the curve was still increasing, indicating that these NPIs were not strong enough to contain the virus.

It was not until the enforcement of a full lockdown and the tightening of border control that the active cases stopped increasing. The number of active cases then began to slowly decrease after day 22.

The relaxing of lockdown rules on day 28 caused the rate of decrease of cases to slow down, and the relaxing of border control further reduced the rate of decrease. Eventually, the partial reopening of facilities caused the number of infections to gradually rise again, resembling the historical trend in Fig. 4 and suggesting that a new wave was approaching, which is correct according to historical data [52]. The overall distribution of the model infection curve resembles the historical data fairly strongly.

A comparison between the sampled normalised model results and real-world data of the UK showed that the Pearson correlation coefficient is 0.92 with a corresponding p-value of 5.09×10⁻⁵⁷, and the Spearman's rank correlation coefficient is 0.95 with a corresponding p-value of 2.46×10⁻⁷¹. Both correlation coefficients are very close to 1 and the p-values are very close to zero, which indicates that the model can produce a very accurate replica of the pandemic in the United Kingdom and successfully captured the trend.

3.5. Discussion

Although the results from the univariate analysis suggested that, of the four safety measures, a lockdown was the quickest way to reduce the infection numbers, the results from the multivariate analysis showed that social distancing and usage of surgical masks have a more significant effect on reducing infection numbers, as they have relatively higher values of μ* and σ.
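For readers less familiar with the two statistics, the sketch below shows how μ* and σ are commonly computed for a single factor in the Morris method: μ* is the mean of the absolute elementary effects, and σ is their sample standard deviation. The function name and the numbers are hypothetical toy values, not results from this study. A large μ* indicates a factor with a strong overall influence on the output, whilst a large σ indicates that the influence varies across the parameter space, for example through non-linearity or interactions with other factors.

```python
from statistics import mean, stdev


def morris_statistics(elementary_effects):
    """Return (mu, mu_star, sigma) for one factor's elementary effects."""
    mu = mean(elementary_effects)
    # mu_star uses absolute values so effects of opposite sign do not cancel.
    mu_star = mean([abs(e) for e in elementary_effects])
    # Sample standard deviation (divides by r - 1 for r trajectories).
    sigma = stdev(elementary_effects)
    return mu, mu_star, sigma


# Toy elementary effects for one factor across four trajectories.
effects = [-2.0, -4.0, 3.0, -1.0]
mu, mu_star, sigma = morris_statistics(effects)
print(mu, mu_star, round(sigma, 3))
```

In this toy case the plain mean understates the factor's influence because one positive effect offsets the negative ones, which is the reason μ* rather than μ is used for ranking.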

4. Conclusion and future work

Whilst the proposed model can produce a reasonably fair estimation of the general trend of COVID-19 as well as the effectiveness of certain preventative measures, the model needs to take into account more real-life factors for it to be used by policy makers to make key decisions. Due to its ease of use and flexibility, it can be used as a tool and an information source for policy makers to test various hypotheses in combination with other research and models and to visualise how certain interventions will affect the infection trend.

One could extend this model in the future by adding features such as small venues, schools and transport, which will allow the model to make even more realistic predictions. One can also continue to monitor trends from different countries and observe how the findings of the Morris Elementary Effects method relate to the trend.

In conclusion, the results from the univariate and multivariate analysis have strongly suggested that there would be no need to enforce a lockdown at all if a sufficient proportion of the population followed the social distancing and mask usage guidelines, as the peak infection number would be sufficiently controlled and relatively low.

Declaration of competing interest

All authors, Kelvin K.F. Li, Stephen A. Jarvis and Fayyaz Minhas, report no conflicts of interest relevant to this article.

Acknowledgements

This study was based on the findings of the lead author's dissertation project which was completed during his time at the University of Warwick, although he is now working as a Data Scientist for the Chinese University of Hong Kong.

FM is supported by the PathLAKE digital pathology consortium which is funded from the Data to Early Diagnosis and Precision Medicine strand of the government's Industrial Strategy Challenge Fund, managed and delivered by UK Research and Innovation (UKRI).

References


Articles from Computers in Biology and Medicine are provided here courtesy of Elsevier
