Skip to main content
UKPMC Funders Author Manuscripts logoLink to UKPMC Funders Author Manuscripts
. Author manuscript; available in PMC: 2024 Dec 1.
Published in final edited form as: Lancet Infect Dis. 2024 Aug 7;24(12):e762–e773. doi: 10.1016/S1473-3099(24)00374-8

Ebola Virus Disease mathematical models and epidemiological parameters: a systematic review

Rebecca K Nash 1,*, Sangeeta Bhatia 1,2,3,*, Christian Morgenstern 1, Patrick Doohan 1, David Jorgensen 1, Kelly McCain 1, Ruth McCabe 1,4,5, Dariya Nikitin 1, Alpha Forna 1,6, Gina Cuomo-Dannenburg 1, Joseph T Hicks 1, Richard J Sheppard 1, Tristan Naidoo 1, Sabine van Elsland 1, Cyril Geismar 1, Thomas Rawson 1, Sequoia Iris Leuba 1, Jack Wardle 1, Isobel Routledge 7, Keith Fraser 1; Pathogen Epidemiology Review Group , Natsuko Imai-Eaton 1, Anne Cori 1,2, H Juliette T Unwin 8,1,#
PMCID: PMC7616620  EMSID: EMS197979  PMID: 39127058

Summary

Ebola Virus Disease (EVD) poses a recurring risk to human health. We conducted a systematic review (PROSPERO CRD42023393345) of EVD transmission models and parameters published prior to 7th July 2023 from PubMed and Web of Science. Two people screened each abstract and full text. Papers were extracted using a bespoke Access database, 10% were double extracted. We extracted 1,280 parameters and 295 models from 522 papers. Basic reproduction number estimates were highly variable as were effective reproduction numbers, likely reflecting spatiotemporal variability in interventions. Random effect estimates were 15.4 days (95% Confidence Interval (CI) 13.2-17.5) for the serial interval, 8.5 (95% CI 7.7-9.2) for the incubation period, 9.3 (95% CI 8.5-10.1) for the symptom-onset-to-death delay and 13.0 (95% CI 10.4-15.7) for symptom-onset-to-recovery. Common effect estimates were similar albeit with narrower CIs. Case fatality ratio estimates were generally high but highly variable, which could reflect heterogeneity in underlying risk factors. While a significant body of literature exists on EVD models and epidemiological parameter estimates, many of these studies focus on the West African Ebola epidemic and are primarily associated with Zaire Ebola virus. This leaves a critical gap in our knowledge regarding other Ebola virus species and outbreak contexts.

Keywords: Ebola virus disease, systematic review, epidemiological parameters, mathematical models

Introduction

The COVID-19 pandemic, and multiple recent outbreaks of re-emerging pathogens, have highlighted the tremendous threat of infectious pathogens to the human population. The Democratic Republic of Congo (DRC) experienced an outbreak of Ebola virus disease (EVD) from 2018-2020. Seven other EVD outbreaks in DRC, Guinea and Uganda have been declared since. In Spring 2022, an outbreak of mpox affected several countries beyond the known endemic region (1). In February-March 2023, Equatorial Guinea and Tanzania faced a Marburg Virus Disease outbreak, the first since 2014 (2). These recurring events reinforce the need for pandemic preparedness. The World Health Organisation (WHO) has listed Ebola virus (EV) as one of nine pathogens posing the greatest threat to public health due to its high epidemic potential and lack of sufficient countermeasures (3).

EV is a deadly filovirus (4), transmitted through close contact and bodily fluids especially during traditional burials and caregiving, which has caused 38 known outbreaks since its discovery in 1976 (appendix pp 4-5)(5). Most of these outbreaks have occurred in Central and Western Africa, and the largest epidemic, the West African (WA) Ebola epidemic, caused over 11,000 reported deaths between 2013 and 2016 mainly across Guinea, Liberia and Sierra Leone (6,7). Four species of EV are known to affect humans: Zaire, Bundibugyo, Sudan and Taï Forest. One species, Reston, is only known to cause disease in non-human primates (4) and a sixth species, Bombali, was identified in samples taken from bats in Sierra Leone (8). The symptoms of Ebola infection can be sudden and include fever, fatigue, muscle pain, headache and sore throat followed by vomiting, diarrhoea, rash, and internal and external bleeding.

Treatment of EVD involves supportive care, such as rehydration (intravenous fluids or oral rehydration) and the stabilisation of oxygen levels and blood pressure. There are two monoclonal antibody treatments recommended for confirmed cases of infection with the Zaire species, REGN-EB3 and mAb114, but access is limited due to uncertainties surrounding pricing and future supply (9). There are no licensed treatments for other EVD species. A vaccine (Ervebo), trialled against the Zaire species during the WA epidemic (10), is now used as part of outbreak response activities to prevent disease using a “ring vaccination” strategy. An alternative vaccine (Zabdeno/Mvabea) consists of two doses given eight weeks apart and is therefore not suitable for use in an emergency context where immediate protection is necessary (11). Three vaccine candidates for the Sudan species are in various trial phases (12). In the absence of widely available vaccines and therapeutics, mitigation of EVD outbreaks relies on a suite of public health and social measures such as case identification and isolation, contact tracing and quarantine, personal protective equipment for health care workers, safe and dignified burials, and community engagement.

Mathematical modelling and outbreak analytics are a component of monitoring and responding to epidemics and have been used effectively during previous outbreaks. Examples include estimating the severity, epidemic delays and projections during the WA epidemic (13,14), bed capacity in Sierra Leone during the WA epidemic (15), severity, epidemic delays and reproduction number estimates in DRC in 2018 (16) and making disease forecasts during the 2018-2019 DRC outbreak (17). Mathematical epidemic models typically use various parameters characterising the pathogen as inputs, for example the incubation and infectious periods, with the robustness of modelling outputs directly depending on the reliability of parameter values. Further, these parameters have direct implications for outbreak control; for example, the upper bound of the incubation period determines the necessary duration of follow-up for contacts of cases (13). Therefore, it is important to centralise evidence around these parameters, to inform the rapid design of mathematical models that could support the response to future EVD outbreaks. Here we undertake a systematic review and meta-analysis to build a database of EVD models and related parameters.

Methods

PRISMA guidelines were used for this systematic review and checklists have been included in appendix pp 11-13.

Search strategy and selection criteria

We searched PubMed and Web of Science databases for English peer-reviewed articles including EVD transmission models, parameters characterising EVD transmission, evolution, natural history, severity and seroprevalence, and risk factors published prior to 7th July 2023 (appendix p 5). Each title and abstract and then full text were screened by two reviewers from a group of 15 (RKN, SB, CM, PD, DJ, KM, RM, AF, GC-D, JH, TN, IR, SvE, AC and HJTU) using Covidence (18) and inclusion / exclusion criteria from appendix pp 5-6. Disagreements were resolved by consensus between reviewers. We used backward citation chaining from 12 of the 179 review papers (1930) identified during screening to add in missing papers (see appendix p 5).

19 reviewers (RKN, SB, CM, PD, KM, RM, DN, GC-D, JH, RS, TN, SvE, CG, TR, SIL, JW, KF, AC and HJTU) extracted data about the articles, models, and parameters from our included papers using Microsoft Access (see appendix pp 6-8 for full list of parameters and extraction information). 4 reviewers (RN, SB, AC and HJTU) extracted further data about the models from the 37 papers that mentioned publicly available code to identify models that could in theory be re-used in future outbreaks (appendix p 8).

To ensure consistency of the data extraction process, data from 55 randomly selected papers (10%) were double extracted, with disagreements resolved by consensus. We used a customised questionnaire to assess the quality and risk of bias of each paper (appendix pp 8-9). Data extractors responded to each question with “Yes,” “No,” or “NA” if the question was not applicable to the article. For each article, the quality score was determined as the ratio of “Yes” answers to the total number of applicable questions answered for that article.

Data analysis

All analysis was done in R using the orderly2 R package (31) for workflow management (see appendix 9-10 for full details). Curated and annotated data are made available in our R package epireview (32). We conducted meta-analyses for incubation periods, serial intervals and time from symptom onset to death or recovery, where the number of estimates exceeded at least two, using the metamean function from the meta R package (33). This returns two estimates, one from a common-effects and one from a random-effects model. The common-effects model assumes that all effect sizes come from a single, homogeneous population, whereas the random-effects model assumes that the estimates from different studies arise from an underlying distribution. While both common- and random-effects models incorporate variance in estimates due to sampling error, the random-effects model posits an additional source of variance and therefore addresses the heterogeneity in estimates across studies (34). We use the I2 statistic to estimate the percentage of total variability in effect sizes that is due to between-study heterogeneity (35). The metamean function was also used to perform sub-group meta-analyses to explore whether these delays varied by EV species. We did not do meta-analyses for other parameters due to too few data in the required format (see appendix pp 9-10), or high variability between the study settings.

Main text figures and tables only include parameters from articles with a high quality assessment (QA) score of at least 50% (appendix pp 8-9), and all parameters are included in the SM. Our full analysis can be reproduced using https://github.com/mrc-ide/priority-pathogens.

Role of the funding source

The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.

Results

Our search returned 24,338 articles, which reduced to 14,690 after deduplication and the addition of two papers identified through other systematic reviews (Figure 1). Following title and abstract screening, 1,674 articles were retained for full text screening, with 522 meeting our inclusion criteria. We extracted 1,280 parameters (from 354 articles) (appendix p 15) and 295 models (from 280 articles). We could link 1,213 of the parameters to a specific EVD outbreak; the vast majority of these (71%, n=858) reported on the WA Ebola epidemic. 1,229 parameters could be linked to a specific EV species and 92% (n=1,136) were associated with the Zaire species.

Figure 1. PRISMA flowchart illustrating the systematic review process.

Figure 1

Published EVD seroprevalence estimates were highest in countries with reported outbreaks (appendix pp 4-5, 16), but varied depending on the population sampled. In the DRC, community-based seropositivity was between 0 and 18.7% (number of studies n=5, number of estimates m=5) across the different assays, whereas in hospitals it was between 0 and 37.0% (n=5, m=8). Similarly, in Guinea, a single population-level estimate (n=1, m=1) was 0.07% but ranged from 59.4%-99.8% in hospitals. Despite no official outbreaks, seropositivity has been reported among population groups in Cameroon, Central African Republic, Kenya, Madagascar, Mali, Tanzania and two multi-country studies (n=10, m=12) (appendix p 17).

Across n=8 articles reporting attack rates (appendix p 19), m=9 central estimates were all below 10%, except for one paper (36) which suggested that up to 31% of physicians had been infected in the 1995 outbreak in Kikwit, DRC. Most estimates (7/10) focused on the general population with generally low uncertainty.

Published estimates of the basic reproduction number (R0) were highly variable (appendix pp 21-23). After removing papers with low QA scores (<50%), m=71 R0 central estimates across n=52 studies ranged from 0.05-12 (Figure 2 and Table 1A). 82% (m=58) of R0 estimates, including the most extreme central estimates, were for the 2013-16 WA epidemic; across all other outbreaks, R0 central estimates ranged from 1.1-7.7 (both for the 2018-20 DRC outbreak). Uncertainty around WA central estimates of R0 was also highly variable, with lower bounds of 0 and upper bounds of 18.5. 97% of R0 estimates (m=69) were for the Zaire species and two were for the Sudan species (from outbreaks in Uganda from 2000-01 and 2022-23). Sudan R0 central estimates were less variable than Zaire estimates (range 2.0 to 2.7).

Figure 2. Basic reproduction numbers (R0) by outbreak.

Figure 2

Each panel corresponds to a different outbreak of EVD with the associated EV species in brackets. Points represent central estimates, with symbol shapes corresponding to central value type. Thick coloured shaded lines represent the range of central estimates when R0 was estimated, for example, across regions or over time. Solid coloured bars represent the uncertainty around the central estimate; this was reported in different formats including standard deviation (in which case the bar represents +/- the standard deviation), 95% highest posterior density interval, range, interquartile range, 95% CrI (credible interval) and 95% CI. The x-axis has been restricted to a maximum of 10 for clarity. All parameters are from articles with a QA score of >= 50% (see appendix pp 21-23 for all R0 estimates).

Table 1. Ranges of estimates for A) basic reproduction numbers by outbreak and estimate type, B) epidemiological delays by Ebola virus species, C) Case Fatality Ratios (CFRs) by outbreak and country.

The total column specifies the number of parameters (QA filtered >=50%) included in the summary range. Some parameter entries provide aggregated ranges of central estimates e.g. across time and countries (see appendix pp 6-8) when more than three values were provided. Additionally, not all central estimates were reported with an associated uncertainty interval.

A) Basic Reproduction Numbers
Outbreak Estimate Type Central Estimate Range Uncertainty Range Total
DRC, 1976
Median 1.34 - 4.71 3.92 - 5.66 1
ORC, 1995
Mean 1.4 - 2.98 1.9 - 4.36 3
Median 1.93 1.74 - 2.78 1
Other/Unspecified 1.8 1
Uganda, 2000-2001
Mean 2.7 2.5 - 4.1 1
West Africa 2013-2016
Mean 0.29 - 10 0 - 18.5 17
Median 0.6 - 5.9 1.1 - 3.55 14
Other/Unspecified 0.05 - 12 0.2 - 15.55 27
DRC, 2014
Other/Unspecified 5.15 3.95 - 6.69 1
DRC, 2018-2020
Other/Unspecified 1.08 - 7.73 4
Uganda. 2022-2023
Other/Unspecified 1.99 - 2.7 1
B) Epidemiological Delays
Delay Species Central Estimate Range Uncertainty Range Total
Generation time
Zaire 10.9 9.2 - 13 1
Incubation period
Bundibugyo 5.7 - 11.3 2 - 27 3
Sudan 6 - 14 1 - 16 3
Zaire 0.1 - 18 0 - 33 30
Infectious period
Zaire 1.7 - 29.6 1.2 - 10.8 12
Latent period
Zaire 0.1 - 31.2 0.1 - 71 7
Serial interval
Sudan 11.7 - 12 8.2 - 15.8 2
Zaire 7 - 19.4 2 - 25 15
Symptom onset to death
Bundibugyo 9 - 11.4 3 - 21 3
Sudan 3 - 11 3 - 15 2
Zaire 1 - 14 0 - 27 28
Symptom onset to recovery/non-infectiousness
Bundibugyo 9.9 - 10 2 - 26 2
Zaire 8.4 - 14 7.5 - 20 5
C) Case Fatality Ratios
Outbreak Country Central Estimate Range Uncertainty Range Total
DRC, 1976
DRC 88 - 96 80 - 98 2
South Sudan, 1976
Sudan 35 - 67 1
Gabon, 1994
Gabon 59 1
DRC, 1995
DRC 69 - 96 73 - 83 6
Gabon, 1996
Gabon 68 - 75 2
Uganda, 2000-2001
Uganda 29 - 77 1
DRC, 2007
DRC 74 68 - 79 1
Uganda, 2007
Uganda 25 - 100 4
DRC, 2008-2009
DRC 44 26 - 62 1
DRC, 2012
DRC 0 - 100 39 - 84 3
West Africa 2013-2016
Guinea 23 - 84 31 - 78 14
Guinea, Liberia, Nigeria, Sierra Leone 32 - 84 69 - 73 1
Guinea, Liberia, Sierra Leone 19 - 100 46 - 86 8
Liberia 9 - 93 49 - 74 7
Liberia, Sierra Leone 38 - 93 6
Multi-country: Europe & USA(n = 9) 18 1
Nigeria 22 - 46 14 - 71 3
Sierra Leone 0 - 100 49 - 99 34
DRC, 2014
DRC 48 - 68 2
DRC, 2017
DRC 50 1
DRC, 2018
DRC 36 - 72 36 - 72 2
DRC, 2018-2020
DRC 23 - 80 7
Uganda 71 - 100 2
Multiple outbreaks
DRC 44 - 96 76 - 82 1

Estimates of the effective reproduction number, Reff, which measures transmissibility in the presence of potential interventions and population immunity, were also highly variable (appendix pp 24-25). In n=23 high QA studies (>=50%), central estimates ranged from 0-9 across m=32 parameter estimates, of which all were for EV Zaire. The majority (72%, m=23) were for the WA epidemic, and the rest were all from the DRC, with only one estimate prior to the WA epidemic (0.73 in the 1995 DRC outbreak).

Estimates of secondary attack rates (n=12, m=15) (SAR) (appendix p 26), growth rates (n=6, m=13) (appendix p 27) and doubling times (n=4, m=6) (appendix p 28) were similarly heterogeneous across studies, with central estimates of SAR in the range 0.1-89% (m=8 for household based studies, m=3 for community based and m=3 for contact based), daily growth rates between 0.0 and 1.4 and doubling times up to decades.

We extracted m=14 estimates of overdispersion from n=11 studies, which all used a Negative binomial distribution to characterise the offspring distribution. Central overdispersion estimates ranged from 0.02-2.2, with lower values indicating more overdispersion (appendix p 30). Other than one entry with an unspecified EV species, all were for Zaire (m=13), and most (86%, m=12) were for the WA Ebola epidemic.

We identified n=31 studies that examined risk factors for EV infection (appendix p 31). Conflicting findings were found across studies, with most risk factors found to be both non-significant and significant. Risk factors such as contact with an infected individual (close contact with the individual or their bodily fluids, household contact and non-household contact), participation in funerals (either attendance or involvement in unsafe burial practices), age, occupation, and hospitalisation were most frequently found to be significant. However, sex was more commonly found to be non-significant (n=12 analyses) than significant (n=3).

Risk factors for seropositivity were similar to those investigated for infection and showed comparable conflicting findings across studies (appendix pp 18, 31). Contact with an animal was most commonly found to be significant. Age, sex, occupation, close contact, household and non-household contact, were most frequently non-significant. Participation in funerals was found equally significant and non-significant. Two studies analysed risk factors for onwards transmission; significant risk factors included age, sex, socioeconomic status, survival, funeral, and being part of the first generation of a transmission chain (appendix p 31).

We extracted m=21 estimates of the serial interval (SI) were reported from n=17 studies (m=19 for Zaire species, m=2 for Sudan), and a single generation time estimate was reported (for Zaire, Table 1B and appendix p 32). The mean SI estimate for high QA studies (n=m=6, all Zaire species) was 16.5 days (95% Confidence Interval (CI) 16.1-16.9) for common effect and 15.4 days (95% CI 13.2-17.5) for random effect (Figure 3A). The high value of I2=94% is indicative of high levels of heterogeneity in the SI, suggesting that the estimates from the common effect might be over precise and the random effect estimates are more reliable. Estimates including three additional low QA studies were similar (appendix p37).

Figure 3. Meta-analysis for the mean A) serial interval, B) incubation period, C) time from symptom onset to death, and D) time from symptom onset to recovery.

Figure 3

All included studies have a QA score of >=50%. Parameters used in the meta-analyses are paired mean and standard deviation of the sample or were converted into mean and SD of the sample from the following combinations: mean and standard error, median and interquartile range, or median and range (see appendix pp 9-10). Blue squares are mean estimates from each study with 95% confidence intervals. Diamonds represent the overall mean across studies for the common and random effect models. The random effect model accounts for within-study and between-study variance.

We identified m=52 estimates for the incubation period from n=43 studies, m=11 estimates of the latent period from n=10 studies, and m=27 estimates of the infectious period from n=23 studies (appendix p 32). Excluding papers with low QA scores (<50%), all estimates of the incubation period were for Ebola Zaire (m=30) except for 6 estimates (m=3 Bundibugyo and m=3 Sudan) with central estimates for all species ranging between 0.1-18. Central estimates ranged from 0.1-31.2 days for the latent period and from 1.7-29.6 days for the infectious period (Table 1B). There were too few high QA latent and infectious period estimates (m=1 for each) in the required format to perform meta-analyses (see Methods and appendix pp 9-10). The pooled mean incubation period estimate, based on n=9 studies, was 8.4 days (95% CI 8.0-8.8) for total common effect and 8.5 days (95% CI 7.7-9.2) for total random effect (Figure 3B). Despite sparse estimates for species other than Zaire in the analysis (m=3), there are statistically significant differences (p=0.02) in mean incubation periods between EV species (with mean incubation period ranging between 4.77-8.36 days for the random-effect model) (appendix 36). After including all infectious period estimates (regardless of QA score; m=3), the pooled mean was 5.4 days (95% CI 5.3-5.5) for common effect and 5.0 days (95% CI 3.7-6.3) for random effect (appendix p 37). Again, the high I2=100% suggests the random effects model is more reliable.

Clinical progression (e.g. symptom onset to death or recovery) and case management (e.g. hospitalisation) are also key modelling inputs. We extracted estimates for delays from symptom onset to test, test result, diagnosis, reporting (or World Health Organisation (WHO) notification), seeking care, admission to care, quarantine, recovery, negative test or undetectable viral load, discharge from care, and death (appendix p 33). We also extracted estimates for delays from admission to care, to discharge, recovery, and/or death (appendix p 34). We found delays in the clinical timeline to be highly variable across contexts. Across all studies (including those with low QA, n=7), the central estimate of symptom onset to reporting varied between 0-25.7 days (m=15), although most (m=13) estimates were around 1/2-2 weeks (appendix p 33). Similarly, the central symptom onset to discharge from care delay varied between 6.3 and 28 days (n=11, m=14) (appendix p 33). Time in care was also highly variable irrespective of the outcome but tended to be shorter for those dying: the central delay from admission to care to death was in the range 0-11 days (n=26, m=33) versus 2.6-17 days for admission to recovery (n=10, m=12) (appendix p 34).

In contrast, reported delays from symptom onset to admission to care were remarkably consistent. Central estimates varied between 0 and 23 days for high QA studies (n=39, m=47) (appendix p 33), but most (44/47) central estimates fell between 3-6 days. We extracted m=14 estimates from n=12 studies for the delay from symptom onset to recovery, with central estimates of those with high QA (m=7) ranging from 8.4-14.0 days (Table 1B and appendix p 33). Based on four estimates from four high QA studies, the pooled mean estimate for the time from symptom onset to recovery was 12.6 days (95% CI 11.7-13.4), for total common effect and 13.0 days (95% CI 10.4-15.7) for total random effect (Figure 3D). Again I2 = 91% was high suggesting the random effects model is more reliable. We extracted m=48 estimates from n=39 studies for the delay from symptom onset to death, with a pooled mean estimate across m=16 suitable estimates from n=13 high QA studies of 9.0 days (95% CI 8.7-9.2) for total common effect and 9.3 days (95% CI 8.5-10.1) for total random effect (Figure 3C). Species sub-group meta-analyses using random effects models indicate that the time from symptom onset to death may be longer (p=0.01), and symptom onset to recovery may be shorter (p<0.01), for those infected with the Bundibugyo species compared to Zaire (appendix p 36).

We extracted m=166 estimates of case fatality ratios (CFR) were extracted from n=130 papers. Early estimates from the DRC in 1976 and 1995 suggest a CFR of greater than 69% (Table 1C, Figure 4, appendix pp 39-42). However, more recent estimates from the WA epidemic (2013-16) and DRC (2018-2020) have lower central values ranging from 18.5% to 93.5% but with wide uncertainties in some settings. In general, estimates of CFR are highly context-specific and factors such as study design, location of outbreak, whether estimates are generated post-outbreak have a strong influence them. Because of the varied settings and methods that were used to generate the extracted estimates, we determined that a meta-analysis was not a feasible approach for synthesis in this case. We therefore used naive measures (crude and weighted mean) to summarise CFR estimates. The mean CFR across all estimates was 57.8%. For the 112 (of 166) estimates where we also had information on the sample size, the mean CFR weighted by the population size was 64.6%. Age, sex and occupation were frequently investigated as potential risk factors for death in both multivariate and univariate analyses (appendix p 43). Age was found to be significantly associated with death in 41/68 parameter entries, whereas most analyses did not find a significant association between death and sex or occupation.

Figure 4. Case Fatality Ratio (CFR) estimates (%) across outbreaks.

Figure 4

Each panel corresponds to a different EVD outbreak, with the associated EV species in brackets. Points represent central estimates, with symbol shapes corresponding to analysis type: adjusted, naïve or unspecified. Thick coloured shaded lines represent the range of central estimates when the CFR was estimated, for example, across regions or over time. Solid coloured bars represent the uncertainty around the central estimates, reported in different formats including 95% CrIs and 95% CIs. All parameters are from articles with QA scores of >= 50% (see appendix pp 39-42).

We extracted m=24 estimates of mutation, substitution and evolutionary rates from n=20 studies (appendix p 44). Central estimates ranged from 0.36 x 10-4-36 x 10-4 substitutions per site per year. The upper bound was an estimate from a single outbreak in DRC (37). In contrast, the same study estimated a much lower substitution rate of 6.9 x 10-4 across multiple countries and outbreaks from 1976-2018. Apart from three estimates for all species (Bundibugyo, Sudan, Zaire and Taï forest) (38,39) and a single estimate for Bundibugyo alone (40), all estimates were for Zaire (one study did not specify the species).

Finally, we extracted existing models from n=280 studies (appendix pp 46-52). Most of the published EVD models (m=295) were compartmental (m=210), followed by branching process (m=19) and agent-based models (m=17) (appendix p 45). There were also 49 other model types or combinations of models. Various assumptions were made in the models including homogenous mixing, heterogeneity in transmission rates between groups or over time and the latent period being the same as the incubation period (appendix p 45). Despite this wealth of knowledge, only 13% of models (m=37) mention publicly available code of which we could access 62% (m=23), limiting reusability. Even where we could access the associated code, only 61% (m=14) of models had accompanying instructions or a README on how to use the code. Of models with available code (appendix 53), 57% (m=13) addressed research questions about transmission, 17% (m=4) focused on method developments and 13% (m=3) generated forecasts. In addition, 30% (m=7) were spatial and none considered spillover. 87% (m=20) of the models were fitted to observed data, using Markov Chain Monte Carlo methods (43%, m=10) and Maximum Likelihood estimation (22%, m=5). Most of the models were coded in R (65%, m=15) followed by MATLAB (13%, m=3).

Discussion

This systematic review presents a comprehensive set of epidemiological parameters and mathematical models for EVD. Most estimates of epidemiological delays were highly variable across studies, which agrees with previous reviews (21,22). This heterogeneity is likely driven by differences in epidemic context, with time to hospitalisation often being used as a marker of response performance (13,14). We found significant differences between species for incubation periods and time from symptom onset to recovery and death, although there was much more evidence for Zaire than for other species. We note the frequent inconsistency in the definition of endpoints, for example reporting versus WHO notification, or recovery versus testing negative, made comparison of epidemiological delays across studies challenging.

Despite the plethora of published evidence, we chose not to summarise all parameters through meta-analysis because there were too few estimates or the definitions and contexts across which parameters have been estimated varied. For example, seroprevalence estimates varied dramatically depending on the population groups being sampled. Seropositivity was very high when looking at serology in people who had past infections e.g. (41) compared to the general population e.g. (42). Differences in R0 values may be driven by more than just the epidemic context e.g. some of the early epidemics with high R0 estimates were driven mostly by nosocomial transmission (43,44) or the methodological approach taken differed (e.g. branching processes e.g. (13) or using phylogenies e.g (45)). Uncertainty in R0 is especially high in the Nigeria WA epidemic context, likely because it was a small outbreak (46). Estimates of seroprevalence (25,27,30) secondary attack rates (20), and R0 (21)are in line with previous reviews, although we did not focus on specific settings so there is greater variability. In particular, we note the setting for the secondary attack rate being important to explain heterogeneities, with estimates from studies with contact and household settings being higher than community based.

CFR estimates also varied greatly between and within outbreaks. This could be due to differences in both the resilience of healthcare systems, patient demographics and conflict (13,47,48). The range in our estimates are consistent with these previous reviews (21,2730). However, there is insufficient evidence to distinguish whether one or more species have an inherently higher severity, or whether the observed differences are driven by contextual factors such as improved case management over time. Summarising parameter estimates that have been generated using widely different methods and relate to varied contexts presents an important statistical problem. Nevertheless, a reasonable synthesis of the plethora of information available from previous outbreaks can provide a useful starting point during a new outbreak. Here we have presented both an average CFR and a weighted average, weighted by the sample size informing the estimate. In line with previous findings,(49) we found CFR estimates to be higher in studies with small sample size. We therefore suggest the use of weighted average CFR in modelling studies with sensitivity analyses over a broad range for robustness. Future research could investigate improved methods to synthesise highly heterogenous estimates such as those presented here.

We also extracted parameters not considered in previous systematic reviews. We found substantial but uncertain levels of overdispersion ranging from 0.02 to 2.2. Evidence from Lloyd-Smith et al. (50) would suggest that this corresponds to between ~30-90% of transmission being attributed to the most infectious 20% of individuals, which is less over dispersive than Severe Acute Respiratory Syndrome (SARS) but highly uncertain. The relatively high mutation rate of the Ebola virus confirms that genomic data may be an important asset to characterise the transmission dynamics in future outbreaks e.g. by reconstructing who infected whom (51). However, like most parameters there is high variability and uncertainty, and most evolutionary estimates are for the Zaire species and the WA epidemic context. Some of the variability in evolutionary rates are likely due to differences in substitution rates between outbreaks compared to within outbreaks (52).

Our study has several limitations. First, due to the extensive literature published on EVD, we restricted our review to published peer-reviewed studies in English. Second, although we attempted to ensure consistency in data extraction by double extracting 55 of the 522 papers, inconsistencies across extractors or incompleteness of data extraction from studies with multiple parameters are possible. Additionally, due to the subjectivity of quality assessment, and the scoring of papers as a whole rather than by individual parameter, consensus among reviewers was difficult to achieve. Third, substantial heterogeneity in the way estimates were reported sometimes made direct comparisons between studies challenging. Often, studies did not distinguish whether uncertainty pertained to the sample or the sample mean, which impacted our ability to include them in meta-analyses. Fourth, specificity of the different seroprevalence tests was often not mentioned and historic papers did not always specify the assay used, making comparisons challenging. Fifth, the large range of reported evolutionary rates may reflect our data extraction method; in particular we did not differentiate between studies using samples solely from humans and those including some samples from animals, nor the method used for estimation or sampling. Finally, we did not extract odds ratios characterising risk factors, nor the direction of protection or risk, and acknowledge that significance classification is somewhat arbitrary and dependent on study design. We encourage readers specifically interested in risk factors to investigate these papers further, for example by accessing the data from this review through the epireview R package (32).

Being prepared at the onset of an infectious disease outbreak is imperative to mount a rapid and effective response to combat the spread of disease. Here we present, synthesise, and analyse the breadth of evidence on the transmissibility, severity, delays, risk factors, mutation rates and seroprevalence of Ebola virus, as well as identify transmission models for EVD, expanding on previous modelling reviews (23,26). We curated 1,280 parameter estimates and 295 model descriptions from 522 studies and make our data available through an easy-to-use R package, epireview (32). We expect that this comprehensive repository will serve as an important resource for modellers and public health community, who can also add to this dynamic database as and when more evidence becomes available, ensuring that this database provides a live picture of the latest evidence on EVD. Much is already known about the Ebola Zaire species; however, our review highlights a critical lack of evidence for other species such as Sudan, Bundibugyo and Taï forest, which is important given that the most recent EVD outbreak was of the Sudan species. Initial analyses suggest statistically significant differences in key parameters between species such as the incubation period and delays from onset to death or recovery. We suggest analysing any data collected during the most recent Ugandan outbreak along with preparing countries to collect and analyse data from future outbreaks for wider research could help fill this knowledge gap. Finally, we note the paucity of publicly available source code for EVD models (8%); publicly releasing the source code and instructions for future models will increase usability of existing models in real-time settings.

Supplementary Material

Supplementary Material

Key Points.

  • We provide an overview of transmission dynamics of Ebola Virus Disease from PubMed and Web of Science from database inception to 7th July 2023 using 522 papers.

  • We present meta-analyses for delay parameters but high variability between settings made similar analyses infeasible for other parameters.

  • We extracted 295 models of Ebola Virus Disease transmission but only 37 (13%) mentioned associated publicly available code with 23 (8%) actually accessible.

  • There is a vast literature about Ebola Virus Disease but predominantly data is from the West African Ebola Outbreak so is Zaire species. There is a lack of information about how this differs to the other species.

  • All data are published alongside this article in the R package epireview.

Research in Context.

Evidence before this study

We searched Web of Science and PubMed up to 7th July 2023 using the search terms: Ebola, epidemiology, outbreaks, models, transmissibility, severity, delays, risk factors, mutation rates and seroprevalence. We identified 179 reviews or overviews of different aspects of Ebola virus disease (EVD) transmission, of which we explored 11 that had “systematic” or “meta” in the title plus one included by expert recommendation. Five reviews focused on case fatality ratios, with estimates ranging between 34-42% for the Bundibugyo Ebola virus species, 53-69% for the Sudan species, 31.6-100% for the Zaire species, and pooled estimates ranging between 28-65% from reviews not specifying the species. Three reviews estimated seroprevalence to be between 3.3-8% depending on the setting and time. Three reviews investigated risk factors and found that caring for a case in the community and participation in traditional funeral rites are strongly associated with acquiring disease. Two reviews reported the incubation period to be 6.3 days for the Bundibugyo species, a range of 3.35-14 days for the Sudan species, and a range of 9-11.4 days across studies on the Zaire species. We found one review considering each of the following: basic reproduction number (1.34–2.7 for Sudan species and 1.8 for Zaire species), serial interval (15-15.3 days for Zaire species), latent period (11.75 days for a combination of Zaire and unspecified species), and secondary attack rates (12.5%, species unspecified). Two reviews consider transmission models, identifying that it is difficult to accurately model the impact of time-dependent changing factors without high quality data, and data are often missing, complicating proper parameterisation of the underlying transmission mechanisms. One specific review looked at the Sudan EVD in response to the outbreak in Uganda in 2023, which highlighted the lack of vaccines and treatment available for this species.

Added value of this study

We provide a comprehensive summary of all available peer reviewed literature of transmission models and the variables needed to parameterise them across all EVD species and outbreaks. Our study synthesises all available analyses until 2023 and additionally considers attack rates, overdispersion and mutation rates. We give updated random effects meta-analyses of incubation periods, serial intervals, symptom onset to death and symptom onset to recovery and, where possible, provide species-specific estimates in the Supplementary Material. We also provide ranges for the basic reproduction number and case fatality ratios without running meta-analyses because these are very setting dependent. We identify that most evidence (92%) is for the Zaire species and highlight that there are knowledge gaps for other species, which should be explored in the future. All our data is held within a bespoke open-source R package to enable others to use this information easily during their model building and updates.

Implications of all the available evidence

Previous outbreaks of infectious pathogens, including the 2013-2016 West African EVD epidemic, emphasise the usefulness of computational modelling in assessing epidemic dynamics and the impact of mitigation strategies. Our study provides an updated and broader overview of all the necessary information for designing and parameterising mathematical models for use in future outbreaks of EVD, including a centralised database for other researchers to use and contribute data to.

Funding

All authors acknowledge funding from the Medical Research Council (MRC) Centre for Global Infectious Disease Analysis (MR/X020258/1) funded by the UK MRC and carried out in the frame of the Global Health EDCTP3 Joint Undertaking supported by the European Union; the National Institute for Health Research (NIHR) for support for the Health Research Protection Unit for Modelling and Health Economics, a partnership between UK Health Security Agency, Imperial College London and London School of Hygiene & Tropical Medicine (LSHTM) (grant code NIHR200908). Additional individual funding sources: AC was supported by the AMS Springboard scheme (reference SBF005\1044).CM acknowledges the Schmidt Foundation for research funding (grant code 6-22-63345); PD, TN acknowledges funding by Community Jameel; DJ acknowledges funding from the Wellcome Trust and Royal Society (216427/Z/19/Z) and PhD funding from EPSRC; GCD acknowledges funding from the Royal Society; RM acknowledges the NIHR Health Protection Research Unit (HPRU) in Emerging and Zoonotic Infections, a partnership between PHE, University of Oxford, University of Liverpool and Liverpool School of Tropical Medicine (grant code NIHR200907); JW acknowledges research funding from the Wellcome Trust (grant 102169/Z/13/Z); RKN and DN acknowledges research funding from the Medical Research Council (MRC) Doctoral Training Partnership (grant MR/N014103/1); KM acknowledges research funding from the Imperial College President’s PhD Scholarship; AF acknowledges funding from the commonwealth scholarship commission; KF acknowledges funding from the Bill and Melinda Gates Foundation, Gavi and the Wellcome Trust, and HJTU acknowledges funding from the Moderna Charitable Foundation. For the purpose of open access, the authors have applied a ‘Creative Commons Attribution’ (CC BY) licence to any Author Accepted Manuscript version arising from this submission.

Declarations

Authors’ contributions

SB, SvE, NI-E and AC, conceptualised this systematic review. RKN, SB, CM, DJ, AF, SvE, AC and HJTU searched the literature and screened the titles and abstracts. RKN, SB, CM, PD, DJ, KM, RM, AF, GC-D, JTH, TN, IR, AC and HJTU reviewed all full-text articles. RKN, SB, CM, PD, KM, RM, DN, GC-D, JTH, RS, TN, SvE, CG, TR, SIL, JW, KF, AC and HJTU extracted the data. RKN, SB and HJTU did formal analysis of, visualised, and validated the data. RKN and SB were responsible for software infrastructure. AC acquired funding. RKN, SB, GC-D, SvE, NI-E, AC and HJTU were responsible for project administration. RKN, PD, GC-D, DN, SvE and HJTU were responsible for training individuals on and accessing Covidence and designing the Access system. HJTU supervised the systematic review. RKN, SB, AC, and HJTU wrote the original draft of the manuscript. All authors were responsible for the methodology and review and editing of the manuscript. All authors debated, discussed, edited, and approved the final version of the manuscript. All authors had final responsibility for the decision to submit the manuscript for publication.

Conflict of interest:

AC reports payment from Pfizer for teaching mathematical modelling of infectious diseases. PD reports payment from WHO for consulting on integrated modelling. All other authors declare no competing interests. The views expressed are those of the authors and not necessarily those of the National Institute for Health and Care Research (NIHR), UK Health Security Agency, or the Department of Health and Social Care. NI is currently employed by the Wellcome Trust. However, the Wellcome Trust had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Contributor Information

Pathogen Epidemiology Review Group:

Aaron Morris, Alpha Forna, Amy Dighe, Anna-Maria Hartner, Anna Vicco, Anne Cori, Arran Hamlet, Ben Lambert, Bethan Daniels, Charlie Whittaker, Christian Morgenstern, Cosmo Santoni, Cyril Geismar, Dariya Nikitin, David Jorgensen, Dominic Dee, Ed Knock, Gina Cuomo-Dannenburg, Hayley Thompson, Ilaria Dorigatti, Isobel Routledge, Janetta Skarp, Joseph Hicks, Juliette Unwin, Kanchan Parchani, Keiran Drake, Keith Fraser, Kelly Charniga, Kelly McCain, Lily Geidelberg, Lorenzo Cattarino, Mara Kont, Marc Baguelin, Natsuko Imai-Eaton, Nima Moghaddas, Pablo Guzman, Patrick Doohan, Paula Christen, Rebecca Nash, Richard Shepherd, Rob Johnson, Ruth McCabe, Sabine van Elsland, Sangeeta Bhatia, Sequoia Leuba, Shazia Ruybal-Pesantez, Sreejith Radhakrishnan, Thomas Rawson, Tristan Naidoo, Zulma Cucunuba Perez, and Jack Wardle

Availability of data and materials

https://github.com/mrc-ide/epireview/tree/main/data

Code availability

https://github.com/mrc-ide/epireview, https://github.com/mrc-ide/priority-pathogens PROSPERO CRD42023393345 (https://www.crd.york.ac.uk/prospero/display_record.php?RecordID&RecordID=393345)

References

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Material

Data Availability Statement

https://github.com/mrc-ide/epireview/tree/main/data

https://github.com/mrc-ide/epireview, https://github.com/mrc-ide/priority-pathogens PROSPERO CRD42023393345 (https://www.crd.york.ac.uk/prospero/display_record.php?RecordID&RecordID=393345)

RESOURCES