Abstract
The COVID-19 pandemic has provided stiff challenges for planning and resourcing in health services in the UK and worldwide. Epidemiological models can provide simulations of how infectious disease might progress in a population given certain parameters. We adapted an agent-based model of COVID-19 to inform planning and decision-making within a healthcare setting, and created a software framework that automates processes for calibrating the model parameters to health data and allows the model to be run at national population scale on National Health Service (NHS) infrastructure. We developed a method for calibrating the model to three daily data streams (hospital admissions, intensive care occupancy, and deaths), and demonstrate that on cross-validation the model fits acceptably to unseen data streams including official estimates of COVID-19 incidence. Once calibrated, we use the model to simulate future scenarios of the spread of COVID-19 in England and show that the simulations provide useful projections of future COVID-19 clinical demand. These simulations were used to support operational planning in the NHS in England, and we present the example of the use of these simulations in projecting future clinical demand during the rollout of the national COVID-19 vaccination programme. Being able to investigate uncertainty and test sensitivities was particularly important to the operational planning team. This epidemiological model operates within an ecosystem of data technologies, drawing on a range of NHS, government and academic data sources, and provides results to strategists, planners and downstream data systems. We discuss the data resources that enabled this work and the data challenges that were faced.
Keywords: Epidemiology, Modelling, Agent-based models, Model calibration, Healthcare, Data
1. Introduction
The COVID-19 pandemic has provided stiff challenges for planning and resourcing in health services in the UK and worldwide. Since the first recorded cases of COVID-19 in England on 29 January 2020 the National Health Service (NHS) in England has to date admitted at least 761,314 patients with COVID-19 to hospital (Gov.uk, 2021a) while also working to provide emergency and elective services. During this time the NHS has rapidly developed a number of key data technologies to monitor the epidemic and to plan its response, including the COVID-19 Data Store (NHS England, 2021a), Coronavirus (COVID-19) Research Platform (NHS England, 2021b), and a range of advanced analytic tools that help to understand and anticipate both COVID-19 and non-COVID-19 demand on health and care services.
In the early days of the pandemic a key need emerged within the NHS for future scenarios of COVID-19 hospital admissions and bed occupancy to inform strategic decisions and local system planning. Epidemiological models are a tool that can help organisations respond to infectious disease threats by providing projections of how disease might progress in a population. Epidemiological models of COVID-19 have been used through the pandemic to inform government and healthcare decision making worldwide (Adam, 2020, Eikenberry et al., 2020, Endo et al., 2020, Ferguson et al., 2020, Kucharski et al., 2020). In the UK expert university groups have provided modelling support to the COVID-19 response through the UK Government’s SPI-M-O group (Scientific Pandemic Influenza Group on Modelling, Operational). However, in Spring 2020 the submitted epidemiological models of COVID-19 did not readily support NHS local planning needs, including the capacity to project hospital admissions, test sensitivity to assumptions, understand uncertainty, and account for geographical variation. To inform local system planning, the NHS sought an epidemiological model that: (a) could be automatically calibrated daily with the latest NHS data; (b) could be calibrated and simulated at the level of sub-national NHS geographies; (c) could estimate the effect of potential policy and health interventions on projected hospital admissions over the next 6 weeks; (d) could quantify the uncertainty of the projections from the model parameters; and (e) had code that was Open Source, unit-tested and easily extendable.
In May 2020, OpenABM-Covid19 satisfied all these criteria and was chosen for this project. Here we present our experiences of adapting and using this model to inform planning and decision making within the NHS in England. Additionally we describe our data and calibration software framework, which automates the analyses. We show that our method produces accurate calibrations to censored health data, allowing flexible simulation of hospital admissions and occupancy scenarios for healthcare planning and for investigation of uncertainties. We highlight how we adapted the model to address some of the emerging challenges through the epidemic including: changes in government policy, new variants of concern, and the NHS vaccination programme. Finally, we discuss the utility of this framework for modelling the second and third waves of the COVID-19 epidemic in England with focus on the data resources that enabled this work and the data challenges that were faced.
2. Materials and methods
2.1. OpenABM-Covid19 is a detailed epidemic model of the spread of COVID-19
OpenABM-Covid19 is an agent-based model of the COVID-19 epidemic developed to simulate the spread of SARS-CoV-2 in an urban environment (Hinch et al., 2021). It is a detailed epidemic model including age-stratification and realistic social networks. Each individual member of the population is represented as a node within multiple networks representing daily interactions across the different settings: a work network (including school attendance for children); a household network; and a random network accounting for social and other interactions within the community. Network parameters are chosen such that the average number of interactions matches age-stratified data reported in a European study of social contacts and mixing patterns (Mossong et al., 2008). Infections are seeded in the population and spread through the networks day-by-day. The model takes into account asymptomatic infections and different stages of severity, and includes the simulation of hospitalisations and ICU admissions. Since symptoms, disease progression and infectiousness are highly age-dependent, disease pathways in the model are age-stratified. By default the model population is one million inhabitants with demographic structure based upon UK-wide census data, and is parameterised to UK demographics (including population size, age profile and household structure) and calibrated to the early UK epidemic; however, it is easily re-parameterised for other populations. The default biological and epidemiological characteristics of COVID-19 disease have been derived from the scientific literature (see Hinch et al., 2021). OpenABM-Covid19 enables the simulation of a range of interventions to limit disease spread including different types of social distancing measures, both manual and digital contact tracing, and vaccination programmes.
During the course of this work OpenABM-Covid19 has been developed to increase the user control of key immunological features of SARS-CoV-2 including natural immunity and immunity waning. The model was originally designed to assist and evaluate contact tracing strategies (Hinch et al., 2020, Abueg et al., 2021) and is now supporting comprehensive modelling and analysis of the transmission dynamics of the virus and contributing to the nowcasting metrics such as the reproduction number R used for tracking the epidemic status within the UK Health Security Agency (UKHSA).
2.2. abmcal is a software framework for calibrating and running OpenABM-Covid19 at scale
abmcal is a Python-based software framework for calibrating and running OpenABM-Covid19 at scale. This software was developed in collaboration between the NHS and industry to enable NHS use of OpenABM-Covid19. It provides functionality for calibrating selected parameters of OpenABM-Covid19 to observed data and for simulation of scenarios at national population scale on NHS computational infrastructure, as well as additional functionality for adjusting model parameters to fit externally generated scenarios. The main model methods are controlled via user-friendly shell commands, which allow the user to specify key model inputs (e.g. epidemiological parameters, NPI details, vaccination schedules). Model uncertainty is assessed by running multiple simulations using different random seeds and aggregating the results to produce central estimates (proxied by the median of the simulations) as well as estimates of stochastic variation (proxied by the range within the 10th and 90th percentiles). Comprehensive model outputs are produced including a full history of infection, transmission and hospital use for each geography, age-group and network. Model outputs also include a set of programmatically generated diagnostic plots to enable rapid checks and diagnostics.
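The aggregation of seeded runs described above can be illustrated with a minimal sketch. It assumes the runs for one geography are held as an array with one row per random seed and one column per day; the function name `summarise_runs` and the example numbers are hypothetical and are not taken from abmcal.

```python
import numpy as np

def summarise_runs(runs):
    """Collapse seeded runs of shape (n_seeds, n_days) into daily summary series.

    Returns the median (central estimate) and the 10th and 90th percentiles
    (a proxy for stochastic variation), each of length n_days.
    """
    runs = np.asarray(runs, dtype=float)
    p10, median, p90 = np.percentile(runs, [10, 50, 90], axis=0)
    return {"p10": p10, "median": median, "p90": p90}

# Hypothetical daily admissions from 5 seeded runs over 3 days.
rng = np.random.default_rng(0)
example_runs = rng.poisson(lam=[100, 120, 150], size=(5, 3))
summary = summarise_runs(example_runs)
```

Percentiles are taken independently for each day, so the resulting band describes day-by-day spread across seeds and is not itself a single simulated trajectory.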
abmcal has two primary modes of operation: calibration and simulation. Calibration estimates model parameters to fit observed local epidemiological and healthcare data. Simulation allows the user to explore hypothetical future scenarios by dynamically updating network and transmission parameters as the simulation progresses to investigate potential effects on clinical metrics including hospital admissions and critical care occupancy. Simulation enables the user to investigate the effects of vaccinations and NPIs, for example social mixing restrictions that limit the number of contacts between individuals (i.e. lockdowns). For both modes the model outputs are available for examination at national, and NHS regional and sub-regional levels.
abmcal was built to allow OpenABM-Covid19 to be run at large-scale matching NHS geographies; for example, it enables calibration and simulation in populations of > 50 million agents through the parallel computation of 42 smaller simulations matched to Integrated Care System (ICS) footprints and demographics, maintaining a 1:1 relationship with the national population of England. To achieve population-scale epidemiological simulation and determine stochastic uncertainty the software provisions compute clusters hosted on Microsoft's cloud, using the Azure Batch service. abmcal also interfaces with the NHS COVID-19 Data Store to provide secure Application Programming Interface (API)-based access to up-to-date NHS SitRep data for model calibration and verification. Scenarios are returned to the COVID-19 Data Store by the user through a secured API call initiated from the command line where they are used by downstream models including projections of future bed occupancy, demand for medicines and Long COVID services, and the Early Warning System — the NHS Trust-level 3-week forecast of hospital admissions and bed occupancy (NHS Early Warning System Team, In Preparation). In addition, key results are reviewed by a multidisciplinary team of data scientists, modellers and epidemiologists and are quickly converted to visualisations and inserted into management information reports used by strategic decision makers. We present this architecture in Fig. 1. While abmcal was initially built to run on the Azure Batch service subsequent development work has focussed on making it adaptable to other high-performance computing environments.
Fig. 1.
Using OpenABM-Covid19 to produce national scale epidemiologic scenarios. Running the model to meet the key operational needs of the NHS in England is achieved by combining multiple data technologies. (1) OpenABM-Covid19 and abmcal are retrieved from their respective GitHub repositories. (2) abmcal is used to calibrate epidemiologic parameters to fit the observed SitReps which are retrieved via secure API access from the COVID-19 Data Store. (3) The calibrated parameters are then used to simulate scenarios and test sensitivities. (4) Both calibration and simulation are computationally intensive processes requiring access to a high performance computing (HPC) environment. abmcal currently utilises the Azure Batch service for this purpose but can be rewritten for other HPC environments. (5) Simulation results are reviewed by the NHS modelling team and are returned to the COVID-19 Data Store where they are used by downstream modelling teams including the Early Warning System (EWS), Bed Occupancy, and Long COVID model. (6) Key results are quickly converted to visualisations and inserted into a management information report used by strategic decision makers.
2.3. Calibration method
Through the COVID-19 pandemic we have used abmcal to regularly calibrate OpenABM-Covid19 to up-to-date NHS SitRep data in England. We first create a 1:1 national-scale population of approximately 59-million agents representative of the household and age composition of the population of England using the Master Patient Index, a patient-level register containing deidentified demographic information at local geographical level. To create manageable populations for the model we split these agents into 42 separate populations corresponding to the 42 NHS England ICS geographies (population size 504,041 to 3,067,569). We generate a matrix for each ICS where the primary axis corresponds to each place of residence (i.e. household) identified in the dataset, a second axis corresponds to groups of 10-year age bands (0–9 through to 80+) and each value represents a count of individuals within each age band in each place of residence. To allow feasible computation large institutional residences (e.g. residential care homes and prisons) are randomly subdivided into household groupings of six residents, the maximum household size used in OpenABM-Covid19, which aligns to the ratio of care home residents to staff at the time and was a reasonable subdivision given the infection control measures then in place. Transfer of infection between these small household groupings is facilitated via the elderly occupation network. We take initial epidemiological parameters from the OpenABM-Covid19 defaults including age-specific disease dynamics parameters, hospitalisation fractions and fatality fractions (described in detail in Hinch et al. 2021).
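The household-by-age-band matrix and the subdivision of large residences can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the abmcal implementation: residences are assumed to arrive as lists of resident ages, and the names `household_matrix` and `split_residence` and the example data are hypothetical.

```python
import random
from collections import Counter

AGE_BANDS = 9       # 10-year bands: 0-9, 10-19, ..., 80+
MAX_HOUSEHOLD = 6   # maximum household size stated in the text

def age_band(age):
    """Map an age in years to its 10-year band index (80+ is the last band)."""
    return min(age // 10, AGE_BANDS - 1)

def split_residence(ages, rng):
    """Randomly subdivide a residence into groups of at most MAX_HOUSEHOLD."""
    ages = list(ages)
    rng.shuffle(ages)
    return [ages[i:i + MAX_HOUSEHOLD] for i in range(0, len(ages), MAX_HOUSEHOLD)]

def household_matrix(residences, seed=0):
    """Return one row of AGE_BANDS counts per (possibly subdivided) household."""
    rng = random.Random(seed)
    rows = []
    for ages in residences:
        groups = split_residence(ages, rng) if len(ages) > MAX_HOUSEHOLD else [list(ages)]
        for group in groups:
            counts = Counter(age_band(a) for a in group)
            rows.append([counts.get(band, 0) for band in range(AGE_BANDS)])
    return rows

# Hypothetical input: one family home and one 14-resident care home.
residences = [[34, 36, 5, 2], [81] * 14]
matrix = household_matrix(residences)
```

In this example the care home is split into groups of six, six and two residents, so the matrix has four rows while the total head count is unchanged.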
We estimate the network-specific proportions of daily interactions between agents through the epidemic (contact parameters; relative to pre-pandemic levels) using the CoMix study of UK social contacts to set levels relative to a benchmark European study of social contacts and mixing patterns (Mossong et al., 2008, Gimma et al., 2021), and obtain up-to-date vaccination data from the NHS vaccine DataMart (part of the COVID-19 Data Store). To account for variability in behaviour across regions of England, abmcal will accept a different set of contact parameters and vaccination schedules for each region and/or ICS. Furthermore, OpenABM-Covid19 allows the control of the amount of contact and strength of transmission on agents’ home, age-stratified occupation, and random networks relative to their pre-pandemic baseline levels of contact and risk of transmission (see Hinch et al. 2021 for a full description of how this is parameterised in the model). As a result the likelihood of an individual contact leading to a transmission will be dependent on the network they interact on, the individual’s age, and the stage/severity of infection.
We developed a calibration method based on hyperparameter grid search of a defined parameter space to set the epidemiological parameters (Fig. 2). This method was developed as a pragmatic solution in the early stages of the pandemic based on our operational constraints including available computational resources and desired speed of implementation. We first calibrate two epidemiological parameters to match observed hospital admissions, critical care occupancy and deaths observed over the first peak of the COVID-19 epidemic in England (up to 1 June 2020). To do this we fix assumptions on the contact parameters that will be used in the simulation and run a grid search to calibrate the mean infectious period (a global parameter that is not geography specific) and then the infectious rate (a per-geography parameter that controls the mean number of individuals infected by each infectious individual with moderate to severe symptoms). We use a uniform grid of 10 different values of the target parameters centred on the default value defined in the initial epidemiological parameters of OpenABM-Covid19. After each search epoch, the value of each parameter that yields the smallest loss (using mean squared error) is selected, and the global loss is calculated and compared to that of the best parameter from the prior epoch. If the improvement in loss is < 5% then the search is deemed complete, otherwise calibration advances to a further epoch. For the following epoch a grid is created centred at this new value and the width of the grid is reduced to half of the previous width. This approach returns a mean infectious period of 6.83 days and a generation time of 5.42 days for the wild type SARS-CoV-2 virus, consistent with other published estimates available at the time (Ferretti et al., 2020, He et al., 2020).
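The epoch-wise grid search described above can be sketched for a single parameter as follows. This is a simplified illustration, not the abmcal code: the `simulate` callable stands in for the median of several seeded OpenABM-Covid19 runs, the toy exponential model and all numerical values are hypothetical, and the 5% stopping threshold follows the description in the text.

```python
import numpy as np

def mse(modelled, observed):
    """Mean squared error between a modelled and an observed series."""
    return float(np.mean((np.asarray(modelled) - np.asarray(observed)) ** 2))

def grid_search(simulate, observed, centre, width, n_points=10, tol=0.05, max_epochs=10):
    """Refine one parameter on a uniform grid whose width halves each epoch.

    simulate(value) must return a modelled series comparable to `observed`.
    The search stops once the relative improvement in the best loss between
    consecutive epochs falls below `tol`.
    """
    best_value, best_loss = centre, None
    for _ in range(max_epochs):
        grid = np.linspace(centre - width / 2, centre + width / 2, n_points)
        losses = [mse(simulate(v), observed) for v in grid]
        i = int(np.argmin(losses))
        epoch_value, epoch_loss = float(grid[i]), losses[i]
        if best_loss is None:
            best_value, best_loss = epoch_value, epoch_loss
        else:
            improvement = (best_loss - epoch_loss) / best_loss if best_loss > 0 else 0.0
            if epoch_loss < best_loss:
                best_value, best_loss = epoch_value, epoch_loss
            if improvement < tol:
                break
        # Next epoch: re-centre on the best value and halve the grid width.
        centre, width = best_value, width / 2
    return best_value, best_loss

# Toy stand-in for the epidemic model: exponential growth at a given daily rate.
days = np.arange(30)
toy_model = lambda rate: 10.0 * np.exp(rate * days)
observed_series = toy_model(0.10)   # synthetic "observations", true rate 0.10
fitted_rate, fitted_loss = grid_search(toy_model, observed_series, centre=0.12, width=0.10)
```

Halving the grid width each epoch trades breadth for resolution, which is why the reviews for implausible local minima described in the text remain important: a poor first-epoch choice cannot be recovered once the grid has narrowed.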
Once the best-fit mean infectious period and infectious rate parameters are identified we hold these constant and run a further grid search over the entire epidemic curve to calibrate the per-geography relative transmission strengths for occupation and random networks to fit the observed hospital admission SitReps (given age-stratified individual contact levels on different networks, interventions such as vaccinations and school closures, and the underlying differences in transmission between networks). This step typically runs for two epochs and returns a single relative infectious rate that is applied in a uniform fashion across both occupational and random network transmission parameters. Our calibration method calibrates the transmission risk in occupation and random networks only based on the assumption that NPIs had little effect on transmission risk between household contacts. We review the outputs after each epoch to verify that the calibration process had avoided any unnatural local minima and the parameter value identified could be considered appropriate for the purposes of further calibration and simulation. We check the output transmission strengths to ensure that they are relatively stable over time; the absence of a stable pattern indicates that other input parameters need to be re-checked.
Fig. 2.
Two-stage calibration of static parameters, followed by time-sensitive parameters. (A; Upper Panel) The first stage of calibration uses a grid search to find a single best value for the mean_infectious_period parameter of OpenABM (representing the average number of days an agent is infectious for) based on a predefined range of 4.5 to 7 days. Following this, best fit values are found for infectious_rate (the R rate) for each ICS within a range of 2.9–8.7. This accounts for variation in the rate of transmission in different geographical areas owing to factors such as population density and time spent in highly infectious environments given the age and household structure of each ICS. Calibration begins by retrieving pre-calibrated parameters for OpenABM (Hinch et al. 2021), which are used for any parameters not calibrated through this process. For each parameter to be calibrated, a uniform grid of 10 points is created based on a predefined range and 10 simulation runs (randomly seeded) for each of the 10 values are run over the specified time period (March-June 2020). Following simulation, the results of the 10 runs for each parameter value are aggregated and a median obtained such that each parameter value contains a single time series. The median modelled daily hospital admission series for each parameter value is then compared to NHS SitRep data for actual daily admissions, a mean squared error is calculated across the series, and the parameter value with the smallest loss is identified. This constitutes a single calibration epoch. The best loss for each epoch is then compared to the best loss for the previous epoch. In cases where the improvement in loss between epochs is greater than 10% of the total loss for the last epoch, a new grid is created which is centred on the best parameter value and with a range that is half the size of the previous epoch and the process repeated.
Once loss between epochs falls below a 10% improvement the value for the parameter is set and calibration moves to calibrate the next parameter. Once both parameters have been calibrated the process terminates and the best values (along with data and visualisations for each stage of calibration) are returned. These outputs are then checked for potential errors in calibration performance that might indicate a sub-optimal value has been identified (e.g. non-normally distributed loss around a parameter value). (B; Lower Panel) The second stage of calibration uses a grid search to calibrate a single value for the relative_transmission_occupation and relative_transmission_random parameters for each phase of the COVID-19 pandemic. Phases are between 2 and 4 weeks in length and correspond to significant changes to infection control measures (e.g. lockdowns or vaccination milestones). These transmission values are searched over a range of 0.2–1.2 where 1 represents the level of transmission in March 2020 prior to the introduction of the first National Lockdown. Relative transmission strengths capture variation that cannot be explained through other parameterised interventions, such as reduction in contact between agents or the introduction of vaccines, and include factors such as regularity of mask-wearing or hand-washing or adherence to other guidelines. Timesteps representing each phase of the pandemic are retrieved based on the inputs provided at the start of each run and an initial grid of 10 values between 0.15 and 1.25 is created and assigned to the two parameters for the given timestep. Each parameter value is simulated 10 times (using different random seeds) and following simulation, the results of the 10 runs for each parameter value are aggregated and a median obtained such that each parameter value contains a single time series.
The median modelled daily hospital admission series for each parameter value is then compared to NHS SitRep data for actual daily admissions and a log-loss function is applied across the series to identify the value with best fit to data. A second epoch is then run with 10 uniformly spaced points to cover a range of −0.125/+0.15 around the best value obtained from the first epoch. Parameter values are searched over two epochs for each timestep sequentially, such that the simulations for a given timestep will run from the first date of available data (20 March 2020) to the start of the following timestep (or the last date in simulation). Once values have been calibrated they are locked, meaning that only the portion of simulation covered by the relevant timestep is being calibrated at any given point.
The full calibration process is computationally intensive (e.g. the initial searches for mean infectious period and infectious rate with the default of 10 random seeds requires 4200 individual simulations per epoch and at least 16,800 runs overall), and takes multiple days, even running in parallel on the high-performance Azure Batch system. As a result, once the base calibration is established, we use a simplified method where we re-search the latest model time periods (typically 2–4 week periods defined by policy or other changes to the context of the epidemic) to find transmission strengths that match the most-recent SitRep and vaccination data; these updates run overnight and enable timely updates to calibration to be produced. We routinely checked that these values did not differ significantly from timestep to timestep to counterbalance incorrect parameters elsewhere in the model. We used these simplified calibrations to make fast and timely updates to incorporate the latest data to support NHS planning, producing a new calibration each week during the most demanding phases of the pandemic. To ensure ongoing accuracy of calibration we update the base calibration periodically (every 1–2 months) to include any updates to estimates of epidemiological parameters, and reflect changes in estimated vaccine effectiveness. This process is also necessary to update to the latest version of OpenABM-Covid19 and to add new features to our modelling as they are released in OpenABM-Covid19. A full protocol for calibration with example commands is available on request.
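The run counts quoted above can be checked with simple arithmetic. The sketch below assumes that each epoch evaluates 10 grid values with 10 random seeds in each of the 42 ICS populations, and that the quoted minimum of 16,800 runs corresponds to four epochs in total (for example, two epochs for each of the two parameters). That decomposition is an assumption made for illustration and is not stated explicitly in the text.

```python
N_ICS = 42          # ICS populations simulated in parallel
GRID_POINTS = 10    # parameter values per grid
RANDOM_SEEDS = 10   # seeded runs per parameter value
MIN_EPOCHS = 4      # assumed: 2 parameters x at least 2 epochs each

runs_per_epoch = N_ICS * GRID_POINTS * RANDOM_SEEDS
minimum_total_runs = runs_per_epoch * MIN_EPOCHS
```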
2.4. Simulation method
Once calibrated, we use abmcal to simulate OpenABM-Covid19 scenarios for use in planning and operational decision making. We typically describe and simulate a set of related scenarios that includes our main scenario, alternative hypotheses, sensitivity tests and counterfactuals. A scenario is specified in terms of parameters that change over time which describe the proportion of normal contacts (i.e. pre-COVID-19 numbers of contacts between individuals in the absence of any policy intervention) taking place in different networks, the likelihood of passing on the virus due to precautionary measures such as social distancing, mask wearing and hand washing, and the effect of preventative measures such as vaccination. These parameters are fixed up to the end date of calibration, but are permitted to change on specified future dates. To simulate we use the same national population of agents as was used in the calibration. Our method reads the calibrated parameters directly from the saved output of calibration; we adjust a small number of user-controllable parameters using command line arguments to define the desired scenario set. These user-input parameters include vaccination schedules, reduction of social contacts across different society layers, the impact of different NPIs and emergence of new variants with variable transmissibility, each of which has an easily communicated real-world definition. As with calibration, to account for variability across regions of England, the abmcal simulation method will accept a different set of user-input parameters for each region or ICS. We do not explicitly model exportation of infectious cases across locations, but we re-seed a small number of infections into locations to prevent the epidemic from dying out when the number of infected agents is low. We set vaccination schedules using anticipated age-specific vaccination volumes from the NHS vaccination programme. 
To set future NPI contacts we use the values from the latest calibrated timestep as a baseline and adjust these based on known future changes to networks (e.g. school holidays) or policy (e.g. the UK government roadmap; Gov.uk, 2021c). To set transmission strengths for simulation we similarly use the parameter values for each ICS found during the calibration method and run these transmission strengths forward into subsequent timesteps. As there is some variability in these parameters from timestep to timestep we typically calculate and carry forward the geometric mean of the last three calibrated values for each ICS. We find this method provides sensible inputs for simulation that provides plausible scenarios at ICS, regional and national levels.
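Carrying forward the geometric mean of the last three calibrated transmission strengths can be sketched as below. The function name `carry_forward` and the example values are hypothetical; only the use of a three-value geometric mean comes from the text.

```python
from statistics import geometric_mean

def carry_forward(calibrated_strengths, window=3):
    """Geometric mean of the most recent `window` calibrated values for one ICS."""
    recent = calibrated_strengths[-window:]
    return geometric_mean(recent)

# Hypothetical relative transmission strengths for one ICS, oldest first.
calibrated = [1.0, 0.5, 0.8, 0.4, 0.2]
projected_strength = carry_forward(calibrated)   # uses 0.8, 0.4 and 0.2
```

A geometric rather than arithmetic mean suits multiplicative quantities such as relative transmission strengths, since it is less influenced by a single unusually high value.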
To generate each scenario we run 1000 simulations from the start of the pandemic to a defined time in the future using the same population of agents used in the calibration run, which represents the entire population of England, across different random seeds using the given parameter set. abmcal outputs the median model run as well as 10th and 90th centiles. The resulting outputs include projections for national hospital admissions, as well as regional and subregional projections. We run the models several months forward from the calibration date as we find this provides operationally useful long-range projections balanced against the increasing uncertainty of input parameters over time. Simulating a set of ∼10 scenarios including main, alternative, sensitivity test, and counterfactual scenarios takes a few hours on the Azure Batch system. Scenario outputs provide a comprehensive history of the simulated epidemic, including the spread of infections within geographies and nationally, network-based and age-based transmission estimates, and clinical metrics such as hospital admissions, intensive care occupancy and deaths. The model outputs go through a quality assurance process and are presented to key decision makers as well as being uploaded to the COVID-19 Data Store where they are used as granular and timely inputs to a range of downstream models. A full protocol for simulation with example commands is available in the abmcal documentation.
2.5. Data access and storage
The COVID-19 Data Store was established to ensure that data could be used by the NHS and government to monitor the spread of the virus and implement appropriate measures to ensure services and support are available to patients (NHS England, 2021a). We use the Data Store to securely access key data for this modelling work including daily SitReps, the Master Patient Index, and COVID-19 vaccinations. Daily SitRep data provide rapid information on the demands that COVID-19 placed on the health care sector nationally. The Master Patient Index is a patient-level register containing deidentified demographic information at local geographical level. COVID-19 vaccination data provide a record of vaccinations given to date and include data on uptake in different cohorts. The Data Store also provides a secure location to upload and share model results, with automated connections to downstream models. In addition to datasets controlled by the NHS, we made use of a number of public and other datasets to parameterise and cross-validate our modelling efforts. These are detailed and discussed in the relevant sections of this manuscript.
3. Results
3.1. OpenABM-Covid19 and abmcal produce stable scenarios useful for planning
To be useful and reliable for planning an epidemiological model needs to be accurately calibrated to observed data and able to produce reliable and useful epidemiological scenarios. We first present an example calibration, showing the fit to SitRep data over the first, second and third waves of the pandemic from 1 March 2020 up to 1 September 2021 (Fig. 3). The median model runs are very close to the observed data for daily hospital admissions and critical care occupancy (Fig. 3A) with the vast majority of the observed data falling between the 10th and 90th centile model runs (98.3% and 97.9% of observations for hospital admissions and critical care occupancy, respectively). The estimated epidemiological and transmission strength parameters lie within biologically plausible ranges (not shown). By calibrating 42 ICSs individually the model is able to account for geographical variation in disease dynamics and accurately fit to admissions data for the seven NHS regions in England (Fig. 3B). Through the pandemic we have been easily able to update the calibration every 1–2 weeks depending on business need with similarly good fits to SitRep data each time.
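The coverage statistic reported above, the share of observations lying between the 10th and 90th centile model runs, can be computed as in the following sketch. The function name `band_coverage` and the example series are hypothetical, and the inclusive treatment of the band edges is an assumption.

```python
import numpy as np

def band_coverage(observed, p10, p90):
    """Percentage of observed daily values lying within [p10, p90] on the same day."""
    observed, p10, p90 = (np.asarray(x, dtype=float) for x in (observed, p10, p90))
    inside = (observed >= p10) & (observed <= p90)
    return 100.0 * float(inside.mean())

# Hypothetical four-day example: the third observation falls below the band.
coverage = band_coverage(
    observed=[5, 12, 9, 20],
    p10=[4, 10, 10, 15],
    p90=[6, 14, 12, 25],
)
```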
Fig. 3.
Maintaining timely calibration of epidemiological parameters to NHS SitRep data. A key requirement for any operational healthcare model is to accurately represent the most up to date information. abmcal is used to calibrate OpenABM-Covid19 to the latest SitRep data. (A) National calibration is done by calibrating three parameters (infectious rate, mean infectious period and transmission strength) to hospital admission, intensive care occupancy and death SitRep data accessed from the COVID-19 Data Store. (B) By calibrating 42 ICSs individually the model is able to account for geographical variation in disease dynamics and accurately fit to admissions data for the 7 NHS regions in England.
To check how well the calibration represents the epidemiology of the epidemic we cross-validate the calibrated model output against publicly available incidence data from the UK Office for National Statistics (ONS) Coronavirus (COVID-19) Infection Survey (Office for National Statistics, 2021a, Office for National Statistics, 2021b) and official case data. Importantly, the model is calibrated to hospital admissions and subsequent clinical events, but undergoes no calibration to infection, positivity or case data. Nonetheless, the calibration shows good concordance with case data and ONS published estimates of incidence at national (Fig. 4A) and subnational (Fig. 4B) levels, indicating that the model is usefully representative of the epidemic in England. In addition, the calibrated model provides a useful estimate of the number of infections that occurred over the first peak of the epidemic in England, before testing and survey measures became widely available.
Fig. 4.
Example cross-validation of the model calibration against other, unseen, data sources. (A) The national calibration shows good concordance with the incidence estimates (shown in green) for England produced by the Office for National Statistics and with a separate estimate of incidence calculated from case data (purple points) such that the number of reported cases is rescaled to match the total number of infections reported by the ONS infection survey over the period 16 May 2021–12 January 2022. The ONS estimates incidence based on PCR positivity in private households among a sample of the population. As our model does not directly output PCR positivity we show the total number of modelled infections as comparison. (B) The regional calibration shows similarly good concordance with the regional estimate of incidence calculated from case data using the above method. The ONS infections survey does not report regional incidence for comparison.
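The case-based incidence estimate described in the caption rescales reported cases so that their total over a reference period matches the total infections reported by the ONS survey for the same period. A minimal sketch of that rescaling follows; the function name and the toy numbers are hypothetical, and a single constant scaling factor over the period is assumed.

```python
def rescale_cases_to_incidence(daily_cases, survey_total_infections):
    """Scale daily reported cases so their sum equals the survey total.

    Returns the rescaled series and the constant scaling factor applied.
    """
    total_cases = sum(daily_cases)
    if total_cases <= 0:
        raise ValueError("daily_cases must sum to a positive number")
    factor = survey_total_infections / total_cases
    return [cases * factor for cases in daily_cases], factor

# Toy numbers (hypothetical): 400 reported cases against 1000 surveyed infections.
rescaled, factor = rescale_cases_to_incidence([100, 200, 100], 1000)
```

The resulting factor is also a rough measure of under-ascertainment over the reference period.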
We next show a typical scenario set of three hospital admissions scenarios simulated from this calibration (Fig. 5A). These scenarios model the end of stage 4 government restrictions in England and the subsequent start of the school term in September 2021. These scenarios were specified by adjusting future contact parameters to account for expected changes in behaviour (based on expert opinion and on prior observed changes in contact behaviour among the population) and are aligned to the scenarios produced by SPI-M-O. The scenarios include school holidays and term times; however, they do not explicitly model the Christmas period, the emergence of the Omicron variant, or any associated changes in behaviour in December 2021. The scenarios model three possible schedules of return to normal contact levels among the population following the end of restrictions on 19 July 2021 and project an ongoing pressure of 500–1500 daily COVID-19 admissions through to the end of November, with the possibility of a higher peak if a rapid return to normal contacts coincided with the start of the school year in September. By overlaying the subsequent SitRep admissions data we show that these data remained within the credible interval of the main scenario for the subsequent 4 months (until the emergence of the Omicron variant). As a result of the ICS level calibration, and the different levels of transmission, prior infections and vaccinations between different geographies, the scenarios provide useful projections of the course of the epidemic across NHS regions (Fig. 5B). These simulated scenario outputs were visualised and shared with planners and strategic decision makers, as well as uploaded to the COVID-19 Data Store where the data were used as inputs to downstream models.
Fig. 5.
Simulated scenarios based on the calibration in Fig. 2. (A) Three scenarios simulated in September 2021 based on different assumptions about return to normal levels of contact among the population (no change from existing levels, gradual change back to 100% of pre-pandemic levels, and rapid change back to 100% of pre-pandemic levels). These scenarios projected 500–1500 daily admissions through to the end of November, with the possibility of a higher peak if a rapid return to normal contacts coincided with the start of the school year in September (green line). SitRep admissions remained within the credible interval of the main scenario (gradual return to normal contacts; black line) until mid-December (a period of 4 months) before the emergence of the Omicron variant resulted in rapid infection growth. These scenarios do not explicitly model the Christmas period, the emergence of the Omicron variant, and any associated changes in behaviour in December 2021. The third scenario is a counterfactual example with no future change to contacts (blue line). (B) The main and rapid-September scenarios provided useful projections of the course of the epidemic for periods of 2–3 months across 6 of 7 NHS regions. In the South East of England the observed SitReps deviated from the main scenario after the September change point and remained closest to the no-change (blue) line. We hypothesise that continued high levels of home-working in this region might have led to a different rate of return to normal contact levels relative to other regions. These scenarios were updated approximately every 2 weeks, after recalibration to the latest SitRep data.
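The three contact assumptions in Fig. 5A (no change, gradual return and rapid return to 100% of pre-pandemic contact levels) can be pictured as daily contact multipliers. The sketch below is illustrative only: the linear ramp, the starting level of 0.5 and the ramp lengths are assumptions and are not the values used in the modelling.

```python
def contact_schedule(current_level, n_days, ramp_days=None, target=1.0):
    """Daily contact multipliers relative to pre-pandemic contacts (1.0 = 100%).

    ramp_days=None keeps contacts at current_level (the no-change counterfactual);
    otherwise contacts rise linearly to target over ramp_days, then stay there.
    """
    if ramp_days is None:
        return [current_level] * n_days
    schedule = []
    for day in range(n_days):
        progress = min(1.0, (day + 1) / ramp_days)  # fraction of ramp completed
        schedule.append(current_level + (target - current_level) * progress)
    return schedule

# Hypothetical inputs: contacts currently at 50% of pre-pandemic levels.
no_change = contact_schedule(0.5, n_days=90)
gradual = contact_schedule(0.5, n_days=90, ramp_days=60)
rapid = contact_schedule(0.5, n_days=90, ramp_days=14)
```

Each list could then be supplied as the future contact parameter for one scenario run.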
Finally, we present a compilation of all the ‘central assumptions’ scenarios for hospital admissions run from October 2020 to February 2022 (the time period when the model was being used for operational planning purposes; Fig. 6). These scenarios represented the most plausible trajectory of hospital admissions based upon the best available evidence at the time. We show 6 weeks of forward simulation for each scenario commencing on the date on which the simulation was done. It is important to note that the data points seen by calibration would be approximately 6–7 days behind the start of each line as a result of lags in data capture (3–4 days) and the time taken to complete a calibration and simulation (2–3 days). Scenarios typically show a good match to observed data and typically mirror the shape of the admissions curve. Some exceptions occur (a) in cases where public policy changed dramatically within the 6-week period (e.g. November 2020 lockdown); (b) when the non-matching scenario is part of an optimistic/pessimistic pair (e.g. July 2021 UK Government Roadmap Stage 4); and (c) as the result of an incorrect assumption in inputs (e.g. the emergence of the Omicron variant in December 2021). Crucially, identifying periods of risk for high hospital admissions was important for operational planning, even when these admissions did not subsequently materialise because of interventions. For example, in October 2020, the NHS was prepared for a potential peak of ∼5500 daily admissions nationally over November and December of 2020 even though policy changes (i.e. lockdowns) meant this peak did not occur. At the time, these scenarios were in line with the Medium-Term Projections of the SPI-M-O models, which were used as informed evidence for the lockdown decided shortly thereafter.
Fig. 6.
Comparison of hospital admissions outputs in ‘central assumptions’ against NHS SitRep data. This figure shows 6 weeks of forward projection for each of the scenarios run using abmcal and OpenABM from October 2020 to February 2022 which represented the ‘best guess’ based on known interventions at the time. Typically, there was one ‘central assumption’ per reporting date, though on occasion (particularly around the introduction of changes to social distancing policies) we considered two ‘central assumptions’ as a high and low bound; this approach can be seen at three points (November 2020, January 2021, and July 2021), each representing uncertainty in the likely effectiveness of the implementation or easing of lockdown restrictions. Dashed purple lines show the median output of simulations for each scenario with the purple shaded area covering the 10th–90th percentiles. Green dots display the actual admissions figure for each day (from NHS SitRep data to which the model was calibrated). Each dashed line begins on the date on which the run was conducted and is shown for the next six weeks. Vertical grey dashed lines show the date on which key policy changes were implemented and are accompanied by labels describing the change at that point. It should be noted that most of these policy changes were not agreed upon and made public until a couple of days prior to coming into effect, and the changes observed in scenarios can be clearly seen in the windows immediately preceding and immediately following their introduction.
3.2. Simulated scenarios provided critical intelligence for planning in Winter 2020
As the ability to support planning and decision-making is the critical role of these simulations we next present an example of how OpenABM-Covid19 and abmcal were used by the NHS in England to support these functions (Fig. 7). This modelling was produced during the period October to December 2020 in order to provide planning scenarios that took into account known or expected government policy interventions. Government policy interventions used in England through the pandemic include lockdowns, school closures, work from home instructions, hospitality venue closures, social distancing measures, and mask wearing. We include these interventions in our modelling through adjustment of the number of contacts to reflect changes in the simulated NPIs. With cases and admissions growing exponentially through October, initial modelling showed the potential for a substantial admissions peak (Fig. 7A). The UK Government imposed a national lockdown in England through much of November 2020 (although schools remained open), which acted to somewhat reduce case and admissions numbers. However, our modelling conducted during November ahead of the proposed release from lockdown identified that while the national numbers were declining, there were multiple NHS geographies that had exhibited exponential growth through the lockdown (not shown). With the national prevalence of COVID-19 still high the potential national admissions peak was still high (but time-shifted into early January; Fig. 7B). This information was used by local system planners ahead of the crucial Christmas and New Year period for hospitals. By systematically varying our assumptions of NPI contact levels through December we were able to explore scenarios where contacts returned to a higher level than they had been prior to the lockdown (e.g. if people ‘made up for lost time’ by increasing the number of contacts they made through the month of December 2020).
By considering this range of scenarios with higher transmission we were somewhat prepared for the emergence of the Alpha (B.1.1.7) variant of concern in December 2020 (Davies et al., 2021) – in this case we had modelled increased transmission driven by increased contacts, which was functionally analogous to increased transmission driven by increased biological transmissibility in the context of normal levels of contacts. Through December the government applied a set of local and regional restrictions to the UK, which we modelled by quickly adjusting expected future contact parameters at ICS level with updated scenarios typically produced within 2 working days of the restrictions being announced (Fig. 7C). In January 2021 the government announced a new national lockdown including a school closure. Modelling produced at this time (Fig. 7D) showed that the combination of these two measures (lockdown plus school closure) was likely to be sufficient to alleviate the national admissions pressure and to enable planners to plan for reductions in COVID-19 admissions (e.g. to plan to increase the number of elective surgeries done).
Fig. 7.
Simulated scenarios provided critical intelligence for strategic decisions in Winter 2020. Each panel shows an example range of scenarios simulated at a given time point with each scenario representing different proposed interventions or different levels of adherence to interventions. (A) Simulated scenarios in mid-October 2020 projected a large admissions peak that would put significant pressure on NHS resources in England. (B) Lockdown policy in November alleviated the immediate pressure; however, modelling continued to project a high (but time-shifted) admissions peak timed for early January 2021. (C) Local and regional lockdown policies in December 2020 reduced the height of the projected admissions peak but still suggested sustained admissions pressure. (D) The announced full national lockdown (lower lines) was projected to substantially reduce the daily admissions pressure relative to the counterfactual no-lockdown scenario (upper line).
3.3. NHS vaccine programme scenarios: an end-to-end example
A significant development in the course of the epidemic has been the rapid development and approval of COVID-19 vaccines and the subsequent rollout of vaccination programmes worldwide. The NHS in England has been at the forefront of this endeavour. We have used simulated scenarios to project the potential effects of the NHS vaccination programme on hospital admissions in England (Fig. 8). We present this work as a short case study as it highlights a number of the key benefits of working with an open source ABM including having a natural description of a system and the flexibility to add new interventions.
Fig. 8.
Adapting to evolving healthcare circumstances by modelling vaccine programme scenarios. Modelling done at the start of the NHS vaccine rollout programme showed the anticipated effect on hospital admissions in England. Ahead of vaccine approval developers of OpenABM-Covid19 and abmcal worked to add a vaccination feature to the model and calibration software. (A) The model was calibrated to actual vaccine uptakes obtained from the vaccine data mart. (B) Regional differences in uptake were taken into account in the calibration. (C) Simulated scenarios show a minimal difference in daily hospitalisations while COVID-19 prevalence was declining in early 2021. A large difference in projected daily admissions between vaccinated and unvaccinated scenarios can be seen as the staged release from lockdown progresses.
Ahead of vaccine approval developers of OpenABM-Covid19 and abmcal worked to add a vaccination feature to the model and calibration software. From December 2020 we modified our calibration method to take into account the actual per-age-group vaccine uptakes obtained from the vaccine data mart (Fig. 8A). Regional differences in vaccine uptake were taken into account in the calibration, with the most significant difference being the uptake in London versus the other six NHS regions (Fig. 8B). Simulated scenarios were based on assumptions about future vaccine roll out informed by the vaccination team and government assumptions used in SPI-M-O modelling (Gov.uk, 2021b). From February 2021 the scenarios also modelled the government roadmap out of lockdown (Gov.uk, 2021c) by setting future contact parameters to match those observed at comparable prior points in the epidemic. These scenarios show a minimal difference in daily hospitalisations during early 2021 when lockdown restrictions were in place and while COVID-19 prevalence was declining. However, a large difference in projected daily admissions between vaccinated and unvaccinated scenarios can be seen as the staged release from lockdown progresses and modelled social contacts increase (Fig. 8C). Detailed modelled outputs enabled the numbers of vaccinated and protected individuals to be visualised over time allowing an evidence-based narrative to be built around the projected benefits of the vaccination programme. At the time of this modelling in late March 2021 the main central scenario showed an admissions peak of around 1000 per day at the end of August 2021. The observed late August peak of 843 admissions is within these bounds and these projections enabled planning for elective recovery and other priorities. Sensitivities to different vaccine effectiveness estimates were tested and the effectiveness of the vaccine against admission in the central scenario was updated in line with emerging evidence.
4. Discussion
This paper demonstrates the adaptation and application of an established agent-based model – OpenABM-Covid19 – calibrated to daily NHS SitRep data, for operational monitoring of, and response to, the COVID-19 epidemic within the NHS in England. We illustrate an operational methodology (abmcal) to calibrate the model to the latest SitRep data, with a quick and robust protocol enabling updates to a calibration to be made in a matter of hours. The calibrated model provided a well-understood, timely and useful representation of the epidemic status, and formed the basis for simulating planning scenarios for national and local decision makers. These scenarios were used alongside other sources of intelligence to inform operational planning and strategic decision-making within the NHS in England through the epidemic, and especially at key points in time such as the emergence of the Alpha Variant of Concern in Winter 2020 and the rollout of the NHS vaccination programme through 2021.
We present this work as an example of how a complex model was used to generate planning scenarios in a challenging and time-sensitive operational healthcare setting. Scenarios provide outlines of possible futures subject to model assumptions and caveats; they are explicitly not used as forecasts or predictions of exactly what will happen in the future. Scenarios can be operationally useful at key timepoints (e.g. to assist planning when there is little available data/information on the effect of policy changes and new interventions), even if the median projections ultimately prove inaccurate. We have used OpenABM-Covid19 and abmcal to simulate a range of possible future scenarios on an ongoing basis, based on plausible parameter estimates, consistent with those obtained from SPI-M-O and COVID-19 epidemiological literature review to enable systems to plan to a well-defined range of possible outcomes. We reviewed these scenarios regularly within our multidisciplinary analytic teams and with senior clinical decision-makers and cross-validated against published SPI-M-O scenario modelling (Gov.uk 2021b). Over time, as new SitRep data emerged we discounted the least plausible scenarios. Our stakeholders knew these scenarios were unlikely to hold any forecast accuracy beyond 4–6 weeks (not least because of unforeseen changes to interventions); however the longer window allows for planning discussions to focus upon a range of evidence-informed potential futures. NHS planning teams have used these scenarios to plan for the height of a potential admissions peak (representing the maximum pressure on the system), the timing of the peak (the timepoint of the maximum pressure), and the area under the curve (representing the total admissions pressure on the system). 
Our modelled scenarios have provided stable long-range projections of the disease course, which have supported the NHS response to the pandemic in England and provided inputs to downstream modelling efforts (including bed demand, medicines demand, Long COVID demand, and the Early Warning System models).
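The three planning quantities named above (peak height, peak timing and area under the curve) can each be read directly from a simulated daily admissions series. The following is a minimal sketch with a hypothetical function name and toy data; for daily counts the area under the curve is taken to be the simple sum of admissions.

```python
def summarise_scenario(daily_admissions):
    """Return peak height, day index of the peak, and total admissions."""
    if not daily_admissions:
        raise ValueError("daily_admissions must not be empty")
    peak_height = max(daily_admissions)
    peak_day = daily_admissions.index(peak_height)  # first day the peak occurs
    total_admissions = sum(daily_admissions)        # area under the daily curve
    return {
        "peak_height": peak_height,
        "peak_day": peak_day,
        "total_admissions": total_admissions,
    }

# Toy median scenario (hypothetical values).
summary = summarise_scenario([100, 300, 500, 400, 200])
```

Applying the same summary to the 10th and 90th centile runs would give a range for each quantity rather than a single value.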
Our method has enabled the use of OpenABM-Covid19 to produce timely and granular national-scale epidemiologic simulations using a large population of agents, calibrated to the latest data. However, it is important to highlight that this method was developed at pace in the early stages of the epidemic in 2020 when information about the emergent virus was limited, and as a result there are several improvements that could be considered in light of current knowledge of COVID-19 and available data in order to improve the accuracy and computational efficiency of the calibration. One future avenue for development could be to investigate other, potentially less computationally intensive approaches to calibration, which cover a more comprehensive parameter space, and give a greater view of the confidence and error in the parameter space. We note that the calibration used for this analysis was not the classic calibration method based on a formal Bayesian approach (e.g. Approximate Bayesian Computation) that uses prior distributions on a number of model parameters to sample through the parameter space and produce a set of posterior distributions across the key model parameters and for the course of the epidemic. Instead, for the analyses here we utilised a fast hyperparameter search methodology that did not produce posterior distributions across all parameters deemed relevant, but instead derived a reasonable parameter set for which the model outputs for the key epidemic outcomes were matched. This was fit for the purpose of the analytical exploration for capacity and healthcare planning at the NHS at a time of scarce data and fast science to inform decisions. In the context of ABMs, developing robust and efficient calibration methods is an active ongoing research theme; taking a more qualitative approach (if hyperparameter search can be called this) is not wrong if an acceptable fit and uncertainty in the outcome can be generated.
In any operational modelling setting there is a critical tradeoff to be made between ease and robustness of calibration, operationally fit-for-purpose model outputs and resource consumption. We recognise that our operational prioritisation of speed may at times have been achieved at the expense of less robust calibration and that there are likely ways to reduce the computational burden. While repeated grid searching is computationally intensive, this method was sufficient for our purposes – returning parameter values within expected ranges – and its simplicity and consistency ensured calibration could be done to a consistent operational cadence and the calibration itself was readily interpretable. Grid-search approaches for calibrating large-scale individual-based models have been used for other COVID-19 models (Abueg et al., 2021, Kerr et al., 2021). For our operational use cases an essential part of the calibration process was the review of the parameters returned and the careful presentation of results to planners and decision makers alongside relevant caveats and explanation. Focussing on a simple calibration process that minimised the number of parameters fitted also aided communication to planning and decision making teams, and the fact that reasons for concerns or caution could be readily explained was instrumental in building understanding and confidence in how the model could be appropriately used. Working with this same calibration approach repeatedly over the course of the pandemic allowed NHS analysts to quickly understand when calibration was performing as expected (finding values for which variation between timesteps was small and consistent with expectation), or unexpectedly (with significant changes to parameter values and difficulty fitting to data).
While more advanced calibration methods would be possible we note that to be operationally useful, these approaches must be developed alongside a framework for explaining the outputs to individuals who do not have a detailed knowledge of the statistical methods employed.
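To make the grid-search idea concrete, the sketch below calibrates a deliberately simple stand-in model by exhaustive search over a small parameter grid. It is not abmcal and does not call OpenABM-Covid19: the exponential toy model, the sum-of-squares error measure and all names and values are assumptions chosen for illustration.

```python
import itertools
import math

def toy_model(growth_rate, scale, n_days):
    """Stand-in for a simulator: exponential growth in daily admissions."""
    return [scale * math.exp(growth_rate * day) for day in range(n_days)]

def sum_squared_error(simulated, observed):
    """Goodness-of-fit measure between simulated and observed series."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))

def grid_search(observed, growth_rates, scales):
    """Evaluate every parameter combination and keep the best-fitting one."""
    best = None
    for growth_rate, scale in itertools.product(growth_rates, scales):
        simulated = toy_model(growth_rate, scale, len(observed))
        error = sum_squared_error(simulated, observed)
        if best is None or error < best[0]:
            best = (error, growth_rate, scale)
    return best

# Synthetic "observed" data generated from known parameters (hypothetical).
observed = toy_model(0.10, 50.0, 20)
best_error, best_growth, best_scale = grid_search(
    observed, growth_rates=[0.05, 0.10, 0.15], scales=[25.0, 50.0, 75.0]
)
```

The appeal of this kind of search is that every candidate parameter set and its error can be inspected and explained; the cost is that the number of model evaluations grows multiplicatively with each parameter added to the grid.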
In carrying out this work we encountered challenges in parameterising and running the model. At the start of this work no good data existed on the number of COVID-19 infections occurring in the population. As a result we took the decision to calibrate the model to the available NHS SitRep data: hospitalisations, intensive care occupancy, and deaths; we did not attempt to calibrate to any estimates of incidence or prevalence. These data had the advantage of being quickly and consistently recorded. Because the number of hospitalisations in each ICS was low we were sometimes fitting the model to very low admissions, ICU occupancy and death values: in the 10–100 range. As a result, we chose to use a 1:1 simulation of the population of England as calibrating and simulating using only a subset of a few thousand agents was not a viable option at that time, or during other ‘low-covid’ periods. The calibration process estimates key parameters of the disease, and when run forward the model estimates the number of infections using these and the default parameters provided by the authors of OpenABM-Covid19. From Summer 2020 estimates of incidence and prevalence of COVID-19 in England have been reported by government and university sources (Elliott et al., 2021, Office for National Statistics, 2021a, Office for National Statistics, 2021b, ZOE COVID Study, 2021) and we have taken steps to cross-validate the model estimates of these parameters against these sources. Nonetheless, there have been gaps in the reporting from these studies, and our model has provided timely estimates of incidence in these cases, which have been used in downstream models (for example Long COVID demand). Our model also provides an estimate of local incidence in the first peak of the epidemic in England, which has been useful in understanding the total numbers of infected and recovered individuals in different geographies and planning services accordingly.
It should be noted that much of the dialogue and official reporting in England has focused on the number of cases (confirmed positive PCR tests), which likely underestimates the number of daily infections by 2–3-fold (Colman et al., 2021) with similar low infection detection rates seen in other nations such as the US (Irons and Raftery, 2021).
A key challenge with using simulated scenarios for planning lies in providing sensible parameter inputs for future time periods. During the course of the epidemic in England we have seen three key changes that have necessitated changes in our modelling approach. First, starting in March 2020 we have seen a series of government interventions designed to limit the spread and impact of the virus, including lockdowns, school closures, social distancing measures, and mandatory mask wearing. Second, there have been changes to the SARS-CoV-2 virus itself, with the emergence and spread of the Alpha (B.1.1.7), Delta (B.1.617.2) and different Omicron variants through the population in Autumn 2020, Summer 2021 and Autumn of 2021, respectively. Foreseeing the emergence of such variants and identifying their intrinsic characteristics, such as transmissibility, is difficult, and these characteristics are impossible to model prior to the emergence and spread of a variant. Third, the development and rollout of the NHS COVID-19 vaccination programme has had a marked impact on the transmission and severity of the disease, changing the relationships between infection and clinical outcomes. We note that modelling within-host components such as waning of immunity from vaccination and/or natural infections is difficult and there are scarce data against which to benchmark this. In each case, these changes have required changes to the technical modelling framework to reflect them, and have affected the validity of simulated scenario projections. For example, assuming that Delta is twice as transmissible as Alpha would produce a completely different simulated future epidemic wave compared with assuming that Delta is three times as transmissible. Similarly, different assumptions about vaccine efficacy against an emerging new variant will profoundly affect medium-term and long-term epidemic trajectories.
Thus a programme of re-parameterisation and model development needed to be undertaken at pivotal points in the policy, behaviour and intervention timelines to produce updated scenarios reflective of these new biological, social and clinical realities. In this regard the open, customisable properties of the model, coupled with the fast availability of key data sources (both NHS SitRep and external data), have been a considerable advantage, allowing scope for rapid experimentation, development and testing to provide scenario-based support to decision-makers during periods of change and uncertainty.
Our modelling proved to be sensitive to NPI contact parameters. Following the method of OpenABM-Covid19 we used the number of contacts as a proportion of PolyMod, with historic contact estimates informed by the CoMix study. By contrast, some other modelling groups have used mobile phone mobility data to parameterise contact rates. However, increased travel is not necessarily a proxy for increased contacts; interestingly, mobility returned to normal levels much more quickly than did contact data. For generation of forward-looking scenarios we have used a number of methods to generate future contact parameters, including discussion within our multidisciplinary analytic team, matching the government roadmap to prior policy points, and aligning with the approaches of SPI-M-O groups (Gov.uk, 2021b). Through this work our modelling team became skilled at reading these complex contact data and adjusting them to match anticipated NPI policy changes. However, this was a laborious task, prone to human error. We therefore highlight that a key data improvement for responding to any future infectious disease threat would be a timely data stream capturing the number and nature of person-to-person contacts, ideally broken down by geography, age and interaction network.
A key limitation of our work is that OpenABM-Covid19 is a detailed model that offers control of a range of epidemiological, social network and interventional parameters, which need to be carefully set. While much effort was made to parameterise the model using best estimates from the literature, in practice some of these parameters were very difficult to assess in real time and are hard to quantify even now that COVID-19 is becoming endemic (e.g. transmission risk by social setting, number and frequency of repeated contacts by setting). This work has given us a clearer idea of the types of data that would need to be gathered in order to run this kind of model to greatest operational effect in the future. With such a detailed model, there is always the risk of over-parameterisation, especially in such cases where it is difficult to obtain empirical estimates of parameters. There are certainly some parameter types that would have benefitted from better studies to help inform parameter choices, and this kind of additional data collection would help to make the model more reliable in future. For example, as a result of the elevated risk of hospital admission with age it was important for us to represent all elderly people in our model population, including those resident in care homes. We made some pragmatic decisions about how to parameterise contacts and transmission risk in this setting (informed by available evidence on carer to resident ratios), but this is certainly an area that would benefit from more detailed studies, further testing of our assumptions, and likely independent modelling exercises. While we note that this section of the population might therefore be at greater risk of being modelled incorrectly, we should reiterate that our modelling was used only to project and plan for hospital admissions and occupancy, and was never intended or used to set strategy for the care sector.
The model presented here was also not used in isolation and was always considered alongside other modelling and sources of intelligence. Therefore we judged the risks of including this sector in our transmission model to be acceptable, and to be outweighed by the benefits for planning the clinical response in hospitals. Similar concerns could be expressed around other institutional settings such as prisons; however, the prison population is relatively small, young, and isolated from other transmission networks so the sensitivity of hospital admissions to these parameters is likely to be minimal. Similarly, without real time data from a contact tracing programme, it would have been difficult to reliably parameterise contact tracing. As we did not have the information in early 2020 to parameterise this well, we chose not to use the contact tracing module for the majority of the work presented here.
The COVID-19 epidemic has served to highlight a key role for epidemiological models and modelling teams in the international response to disease threats, with modellers helping to inform urgent decisions when available evidence was limited. A range of models of COVID-19 spread have been used as the pandemic developed, including compartmental (Davies et al., 2020; Keeling et al., 2021; Knock et al., 2021) and agent-based models (Hoertel et al., 2020; Kerr et al., 2021). Compartmental models, which include the SIR model and its derivatives, are routinely used for a range of infectious diseases and offer the advantages of being well established, straightforward to parameterise and computationally non-demanding; however, they do not easily account for individual variability, complex interactions, or emergent behaviours. ABMs by contrast are simulations of individual agents who interact according to a set of rules, which offer the advantages of providing a natural description of a system, giving flexibility to tune interventions and interactions between agents, and enabling investigation of emergent phenomena (Bonabeau, 2002). However, ABMs can be complex to build and code, difficult to parameterise, and computationally intensive to calibrate, run and scale. While this paper focuses on our use of a single ABM we should stress that this model was used as part of a suite of analytical tools within the NHS in England, each with its own strengths and limitations, to inform the operational healthcare response to the epidemic. Having a range of different models created a broad evidence base from which operational planning could be undertaken, considering the relative strengths and weaknesses of different approaches to address the challenges faced at different stages of the pandemic. 
We would hope that the response to any future epidemic would once again be informed by data and modelling, with a robust ecosystem of diverse models and data sources from across academia, citizen science, government, healthcare and industry all contributing to a well-integrated system of surveillance and decision making.
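The contrast drawn above between compartmental models and ABMs can be illustrated with a deliberately minimal sketch. The code below is not OpenABM-Covid19 and does not reflect its parameterisation; the function names, the random-mixing contact rule and all parameter values are hypothetical, chosen only to show that a compartmental SIR model tracks three aggregate numbers while an ABM tracks the state of every individual.

```python
import random


def sir_compartmental(n, i0, beta, gamma, days):
    """Deterministic SIR with a daily time step (illustrative only).

    Returns a list of (S, I, R) tuples, one per day including day 0.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / n  # mass-action mixing
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def sir_agent_based(n, i0, contacts_per_day, p_transmit, p_recover, days, seed=0):
    """Stochastic agent-based SIR with uniform random contacts (illustrative only).

    Each infectious agent meets `contacts_per_day` randomly chosen agents per day
    and infects each susceptible contact with probability `p_transmit`.
    """
    rng = random.Random(seed)
    state = ["I"] * i0 + ["S"] * (n - i0)
    history = [(n - i0, i0, 0)]
    for _ in range(days):
        next_state = list(state)  # update synchronously from yesterday's states
        for agent, status in enumerate(state):
            if status != "I":
                continue
            for _ in range(contacts_per_day):
                other = rng.randrange(n)
                if state[other] == "S" and rng.random() < p_transmit:
                    next_state[other] = "I"
            if rng.random() < p_recover:
                next_state[agent] = "R"
        state = next_state
        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history


if __name__ == "__main__":
    # Hypothetical parameter values, not calibrated to any data.
    ode = sir_compartmental(n=10_000, i0=10, beta=0.3, gamma=0.1, days=120)
    abm = sir_agent_based(n=10_000, i0=10, contacts_per_day=3,
                          p_transmit=0.1, p_recover=0.1, days=120, seed=1)
    print("compartmental peak infectious:", round(max(i for _, i, _ in ode)))
    print("agent-based peak infectious:  ", max(i for _, i, _ in abm))
```

Even in this toy form the trade-off is visible: the compartmental version needs two rate parameters and a few arithmetic operations per day, whereas the agent-based version loops over individuals and contacts, but could be extended with per-agent attributes such as age or household without changing its structure.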
5. Conclusions and perspectives
The COVID-19 pandemic has provided stiff challenges for planning in health services in the UK and worldwide, and has served to illustrate the key role of data and data analysis in the surveillance and response to infectious disease threats. Using OpenABM-Covid19 and abmcal, and working alongside academic and industry partners, the NHS in England has been able to rapidly calibrate to up-to-date epidemic data and simulate the epidemic in populations of more than 50 million individuals matching the age, occupation and household structures of the population of England. These simulated scenarios have provided useful projections of the course of the epidemic and have been shared with planners and strategic decision makers, as well as uploaded to the COVID-19 Data Store where the data are used as inputs to downstream models. Scenario outputs at national, regional and subregional levels have enabled understanding of geographical variation in the course of the epidemic and facilitated planning at local system level. Owing to the flexibility of OpenABM-Covid19 the NHS has been able to adapt methods to model the changing face of the epidemic, including during Winter 2020 and the rollout of the NHS vaccination programme. These simulated epidemiological scenarios form part of a suite of analytics tools and products that have supported NHS planning and response to the epidemic, including planning non-COVID-19 activity. An ABM can be a useful part of the toolkit for operationally focussed epidemiology teams, providing a mechanism to investigate future scenarios for planning purposes. There is clearly more that can be done to improve the statistical validity of calibration approaches, and operational teams should look to the latest published evidence in this area in order to improve the robustness of their work. We would encourage innovations in this space to be accompanied by open source code, good written documentation, and comprehensive tests. 
This work has been a highly collaborative effort, with contributions from industry, academia and the NHS, drawing on wider work from government, public bodies and university researchers. It highlights the importance of a strong interconnected data landscape and showcases the role that data will have in responding to future challenges, including infectious disease threats.
Funding
JPG's work was supported by funding from the UK Health Security Agency and the UK Department of Health and Social Care (DHSC). This work was also supported by DHSC funding awarded to CF and by a Li Ka Shing Foundation grant awarded to CF.
CRediT authorship contribution statement
Nick Groves-Kirkby: Concept, Data analysis, Drafting revising approval. Ewan Wakeman: Concept, Data analysis, Drafting revising approval. Seema Patel: Concept, Data analysis, Drafting revising approval. Robert Hinch: Concept, Analysis, Drafting revising approval. Tineke Poot: Concept, Data analysis, Drafting revising approval. Jonathan Pearson: Concept, Data analysis, Drafting revising approval. Lily Tang: Concept, Data, Drafting revising approval. Edward Kendall: Concept, Data, Drafting revising approval. Ming Tang: Concept, Data, Drafting revising approval. Kim Moore: Data, Analysis, Drafting revising approval. Scott Stevenson: Data analysis, Drafting approval. Bryn Mathias: Data analysis, Drafting approval. Ilya Feige: Data analysis, Drafting approval. Simon Nakach: Data analysis, Drafting approval. Laura Stevenson: Data analysis, Drafting approval. Paul O'Dwyer: Data analysis, Drafting approval. William Probert: Analysis, Drafting revising approval. Jasmina Panovska-Griffiths: Analysis, Drafting revising approval. Christophe Fraser: Concept, Analysis, Drafting revising approval.
Conflict of interest statement
The authors declare no conflict of interest.
References
- Abueg M., et al. Modeling the effect of exposure notification and non-pharmaceutical interventions on COVID-19 transmission in Washington state. npj Digit. Med. 2021;4:49. doi: 10.1038/s41746-021-00422-7.
- Adam D. Special report: The simulations driving the world's response to COVID-19. Nature. 2020;580(7803):316–318. doi: 10.1038/d41586-020-01003-6.
- Bonabeau E. Agent-based modeling: Methods and techniques for simulating human systems. PNAS. 2002;99(suppl 3):7280–7287. doi: 10.1073/pnas.082080899.
- Colman E. et al., 2021, Estimating the proportion of SARS-CoV-2 infections ascertained through diagnostic testing. medRxiv 2021.02.09.21251411 https://doi.org/10.1101/2021.02.09.21251411.
- Davies N.G., et al. Association of tiered restrictions and a second lockdown with COVID-19 deaths and hospital admissions in England: a modelling study. Lancet Infect. Dis. 2020;21(4):482–492. doi: 10.1016/S1473-3099(20)30984-1.
- Davies N.G., et al. Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England. Science. 2021;372(6538) doi: 10.1126/science.abg3055.
- Eikenberry S.E., et al. To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic. Infect. Dis. Model. 2020;5:293–308. doi: 10.1016/j.idm.2020.04.001.
- Elliott, P. et al., 2021, Exponential growth, high prevalence of SARS-CoV-2, and vaccine effectiveness associated with the Delta variant. https://doi.org/10.1126/science.abl9551.
- Endo A., et al. Estimating the overdispersion in COVID-19 transmission using outbreak sizes outside China. [version 3; peer review: 2 approved] Wellcome Open Res. 2020;5:67. doi: 10.12688/wellcomeopenres.15842.3.
- Ferguson, N.M. et al., 2020, Report 9: Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand. Preprint at Spiral. https://doi.org/10.25561/77482.
- Ferretti L., et al. Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing. Science. 2020;368(6491) doi: 10.1126/science.abb6936.
- Gimma, A. et al., 2021, CoMix: Changes in social contacts as measured by the contact survey during the COVID-19 pandemic in England between March 2020 and March 2021 [preprint]. medRxiv 2021.05.28.21257973; https://doi.org/10.1101/2021.05.28.21257973.
- Gov.uk, 2021a, Coronavirus (COVID-19) in the UK. 〈https://coronavirus.data.gov.uk/details/healthcare?areaType=nation&areaName=England〉 (accessed 26 June 2022).
- Gov.uk, 2021b, SPI-M papers 〈https://www.gov.uk/search/all?parent=scientific-advisory-group-for-emergencies&keywords=spi-m-o&organisations%5B%5D=scientific-advisory-group-for-emergencies&order=relevance〉.
- Gov.uk, 2021c, COVID-19 Response – Spring 2021 (Roadmap). 〈https://www.gov.uk/government/publications/covid-19-response-spring-2021〉.
- He X., et al. Temporal dynamics in viral shedding and transmissibility of COVID-19. Nat. Med. 2020;26:672–675. doi: 10.1038/s41591-020-0869-5.
- Hinch, R. et al., 2020, Effective configurations of a digital contact tracing app: a report to NHSX. 〈https://github.com/BDI-pathogens/covid-19_instant_tracing/blob/master/Report%20-%20Effective%20Configurations%20of%20a%20Digital%20Contact%20Tracing%20App.pdf〉 (accessed 26 October 2021).
- Hinch R., et al. OpenABM-Covid19: An agent-based model for non-pharmaceutical interventions against COVID-19 including contact tracing. PLoS Comput. Biol. 2021;17(7) doi: 10.1371/journal.pcbi.1009146.
- Hoertel N., et al. A stochastic agent-based model of the SARS-CoV-2 epidemic in France. Nat. Med. 2020;26:1417–1421. doi: 10.1038/s41591-020-1001-6.
- Irons, N.J. & Raftery, A.E., 2021, PNAS August 3, 2021 118 (31) e2103272118; https://doi.org/10.1073/pnas.2103272118.
- Keeling et al., 2021, Predictions of COVID-19 dynamics in the UK: Short-term forecasting and analysis of potential exit strategies. PLoS Comput. Biol. https://doi.org/10.1371/journal.pcbi.1008619.
- Kerr C.C., et al. Covasim: An agent-based model of COVID-19 dynamics and interventions. PLoS Comput. Biol. 2021;17(7) doi: 10.1371/journal.pcbi.1009149.
- Knock, et al. Key epidemiological drivers and impact of interventions in the 2020 SARS-CoV-2 epidemic in England. Sci. Transl. Med. 2021;13(602):eabg4262. doi: 10.1126/scitranslmed.abg4262.
- Kucharski A.J., et al. Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet Infect. Dis. 2020;20(5):553–558. doi: 10.1016/S1473-3099(20)30144-4.
- Mossong J., et al. Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS Med. 2008;5(3) doi: 10.1371/journal.pmed.0050074.
- NHS Early Warning System Team, A framework for forecasting system-wide COVID-19 healthcare demand metrics for operational purposes (in preparation).
- NHS England, 2021a, NHS COVID-19 Data Store. 〈https://www.england.nhs.uk/contact-us/privacy-notice/how-we-use-your-information/covid-19-response/nhs-covid-19-data-store/〉 (accessed 26 October 2021).
- NHS England, 2021b, OpenSAFELY – the Coronavirus (COVID-19) Research Platform. 〈https://www.england.nhs.uk/contact-us/privacy-notice/how-we-use-your-information/covid-19-response/coronavirus-covid-19-research-platform/〉 (accessed 26 October 2021).
- Office for National Statistics, 2021a, Population estimates for the UK, England and Wales, Scotland and Northern Ireland: mid-2020. 〈https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/bulletins/annualmidyearpopulationestimates/mid2020〉.
- Office for National Statistics, 2021b, Covid-19 Infection Survey. 〈https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/coronaviruscovid19infectionsurveypilot/latest〉.
- ZOE COVID Study, 2021. 〈https://covid.joinzoe.com/〉.








