American Journal of Public Health
Editorial. 2022 Jun;112(6):839–842. doi: 10.2105/AJPH.2022.306831

Collaborative Hubs: Making the Most of Predictive Epidemic Modeling

Nicholas G Reich 1, Justin Lessler 1, Sebastian Funk 1, Cecile Viboud 1, Alessandro Vespignani 1, Ryan J Tibshirani 1, Katriona Shea 1, Melanie Schienle 1, Michael C Runge 1, Roni Rosenfeld 1, Evan L Ray 1, Rene Niehus 1, Helen C Johnson 1, Michael A Johansson 1, Harry Hochheiser 1, Lauren Gardner 1, Johannes Bracher 1, Rebecca K Borchering 1, Matthew Biggerstaff 1
PMCID: PMC9137029  PMID: 35420897

The COVID-19 pandemic has made it clear that epidemic models play an important role in how governments and the public respond to infectious disease crises. Early in the pandemic, models were used to estimate the true number of infections. Later, they estimated key parameters, generated short-term forecasts of outbreak trends, and quantified possible effects of interventions on the unfolding epidemic.1,2 In contrast to the coordinating role played by major national or international agencies in weather-related emergencies, pandemic modeling efforts were initially scattered across many research institutions. Differences in modeling approaches led to contrasting results, contributing to confusion in public perception of the pandemic. Efforts to coordinate modeling in so-called “hubs” have provided governments, healthcare agencies, and the public with assessments and forecasts that reflect the consensus in the modeling community.3–6 This has been achieved by openly synthesizing uncertainties across different modeling approaches and facilitating comparisons between them.

USING MODELS TO SEE INTO THE FUTURE

Epidemic models can give insight into the future course of an epidemic, either through short-term forecasts or through the creation of longer-term planning scenarios that assume a set of future conditions (Figure A, available as a supplement to the online version of this article at http://www.ajph.org).

Forecasts are explicit quantitative statements about probabilities of specific events in the future, such as incidence rates of cases, hospitalizations, or deaths. Such statements can be compared with eventual observations and can be rigorously assessed to demonstrate model accuracy in real time. However, reliable pandemic forecasts can be made for only a short period into the future. This is because of uncertainties about the underlying epidemic process, challenges in anticipating outbreak-altering events (e.g., emergence of a new variant), difficulties in predicting human behavior, and future interventions, which may change in response to the forecasts themselves.
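To illustrate how a probabilistic forecast can be compared with an eventual observation, the following minimal sketch computes the quantile (pinball) loss, a standard scoring rule for forecasts expressed as quantiles. The forecast values and the observed count are hypothetical, and the choice of this particular score is an assumption made for illustration, not a description of how any specific hub evaluates its forecasts.

```python
def pinball_loss(quantile_level, predicted, observed):
    """Quantile (pinball) loss for a single predicted quantile; lower is better."""
    if observed >= predicted:
        # Under-prediction is penalized in proportion to the quantile level.
        return quantile_level * (observed - predicted)
    # Over-prediction is penalized in proportion to (1 - quantile level).
    return (1 - quantile_level) * (predicted - observed)


def mean_pinball_loss(quantile_forecast, observed):
    """Average the pinball loss over all quantile levels of one forecast."""
    losses = [
        pinball_loss(level, predicted, observed)
        for level, predicted in quantile_forecast.items()
    ]
    return sum(losses) / len(losses)


# Hypothetical forecast of weekly deaths: quantile level -> predicted count.
forecast = {0.1: 80, 0.5: 100, 0.9: 130}
observed = 110  # hypothetical count reported after the forecast was made

score = mean_pinball_loss(forecast, observed)
print(f"mean pinball loss: {score:.2f}")
```

Because the score is computed only from the forecast and the later observation, the same calculation can be applied to every participating model, which is what allows side-by-side comparison.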

Scenario modeling acknowledges these limitations and gives plausible future epidemic trajectories under a well-defined set of conditions (or assumptions), which in turn can provide stakeholders with information to aid in long-term planning. These planning scenarios can be designed to inform a range of decisions, from choosing between different disease control policies to a business determining what must be done to weather coming epidemic disruptions. However, because the assumptions of scenarios are unlikely to occur in exactly the way they have been defined, it is difficult to objectively assess the performance of models making these projections.

Different types of methods may be suitable for generating forecasts and scenarios. On the one hand, statistical and simple mechanistic models often perform particularly well at short-term forecasting. On the other hand, more complex mechanistic approaches sometimes struggle with making accurate short-term forecasts because of challenges in accounting for uncertainty about the underlying state of the system. For longer-term planning scenarios, models must be able to encode scenario assumptions (e.g., waning immunity, behavior changes). This requires structural complexity that many statistical or simple mechanistic models lack.

Whether aimed at forecasting or planning scenarios, epidemic models vary widely in how they are composed. For example, models can vary in terms of what data they use, what they assume about transmission, and what analytic approach they use to produce projections. Relying on one model is therefore dangerous: there is no guarantee that its choices and assumptions will yield an accurate prediction.

In many fields, there is a long tradition of combining multiple models to mitigate this limitation by providing a single prediction that summarizes the view of the participating models.7 There has been a growing interest in using ensemble methodologies in epidemiology, with notable efforts in forecasting, risk prediction, causal inference, and decision-making.8–12
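As an illustration of how several models' outputs can be summarized in a single prediction, the sketch below takes the median across models at each quantile level. The model names and numbers are invented, and the median is only one simple combination rule among several; it is shown as an assumption for illustration, not as the method of any particular ensemble.

```python
from statistics import median

# Hypothetical quantile forecasts of weekly deaths from three made-up models.
# Each inner dict maps a quantile level to a predicted count.
forecasts = {
    "model_a": {0.1: 80, 0.5: 100, 0.9: 130},
    "model_b": {0.1: 90, 0.5: 120, 0.9: 160},
    "model_c": {0.1: 60, 0.5: 95, 0.9: 150},
}


def median_ensemble(model_forecasts):
    """Combine quantile forecasts by taking the cross-model median at each level."""
    levels = sorted(next(iter(model_forecasts.values())))
    for name, quantiles in model_forecasts.items():
        # Models can only be combined level by level if they report the same levels.
        if sorted(quantiles) != levels:
            raise ValueError(f"{name} does not report the same quantile levels")
    return {
        level: median(q[level] for q in model_forecasts.values())
        for level in levels
    }


ensemble = median_ensemble(forecasts)
print(ensemble)
```

A median is less sensitive than a mean to a single model with an extreme prediction, which is one reason it is a common default when the component models differ widely.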

COORDINATION, COLLABORATION, AND EVALUATION

A modeling “hub” is a consortium of research groups organized around a particular scientific challenge. Hubs in many fields, including climatology and ecology, have helped to build consensus and translate individual model outputs into collective quantitative wisdom. This process often takes place in close collaboration with partners who will ultimately benefit from the modeling output.

Collaborative, multiteam infectious disease modeling efforts have existed in various forms for at least 10 years and have played a central role in the COVID-19 response (Figure B, available as a supplement to the online version of this article at http://www.ajph.org). COVID-19 hub efforts (including forecasting and scenario hubs in the United States and Europe) have leveraged research networks, software, and techniques developed for forecasting efforts around dengue,8 influenza,10 and Ebola.11 These COVID-19 hubs aimed to (1) create real-time modeling systems that provide useful information to partners; (2) create “feedback loops” for modelers by encouraging model development, evaluation, and comparison; and (3) foster a modeling community with an open science ethos.

Despite differences between forecasting and scenario projections, there is still value in taking a “hub approach” to both tasks. Over time, ensembles of multiple models have provided more reliable information than any one model. In the US COVID-19 Forecast Hub, an ensemble was the most consistently accurate forecaster of mortality over the course of the COVID-19 pandemic (through December 2021).3 This finding echoes previous outbreak forecasting research, where ensembles consistently performed well, if not the best, on all evaluated metrics.8,10,11

It is harder to assess performance, or even to define what we mean by accuracy, for long-term scenarios because these projections are made under specific sets of assumptions that may or may not come to pass. Nonetheless, the hub approach provides critical benefits by ensuring that models are focused on the same broad assumptions about the future. Here, too, appropriate ensemble methods can distill results to facilitate interpretation and inform action (Figure A).12

MODELS NOT ORACLES

The ensemble or hub approach is not a guarantee of accuracy or utility. The US COVID-19 Forecast Hub ensemble (including many component models) has struggled to produce accurate forecasts of cases and hospitalizations during periods of rapidly changing epidemic dynamics, such as the US peak of the winter wave in early 2021 or the rapid increases associated with the Delta variant in summer 2021 or in winter 2021–2022.3 Likewise, although longer-term projections from the COVID-19 Scenario Modeling Hub projected a Delta-associated resurgence in the United States, the ensemble significantly underestimated its speed and size, even though there were no clear deviations from scenario assumptions.13

However, even when projections are wrong, the hubs play a role in enhancing the scientific rigor and integrity of epidemic modeling. The coordination provided by hubs ensures that approaches may be prospectively and objectively evaluated in uniform, fair, and unbiased comparisons. Furthermore, by evaluating many models simultaneously, we can gain insight into whether successes and failures are properties of individual approaches or represent a challenge to the field as a whole.

THE SHARED CHALLENGE OF DATA

In contrast with weather forecasting, which has seen sustained investment in data collection infrastructure for decades, public health surveillance systems lag far behind. The lack of timely, granular, and relevant data limits model performance. By partnering with parallel data curation efforts, hubs can help the community access critical data sources and overcome challenges together.

Data challenges are present even in the most seemingly straightforward of model inputs, such as the number of reported COVID-19 cases in a geographic area or jurisdiction. Case definitions can vary by geography and time, and reporting frequencies and rates of testing have changed over time. These issues have led to fundamental changes in what a reported case represents during the pandemic.

To help mitigate these data issues, COVID-19 modeling hubs have developed close relationships with data curation teams.14,15 These relationships have been critical to COVID-19 hubs, both in providing a source of common “ground truth” data on which models can be fit, evaluated, and compared and in being stores of expertise in dealing with heterogeneous and inconsistent data streams. Active communication between data and modeling communities has proved critical. This process ensures that modeling teams have information about data anomalies and changes in reporting that could fundamentally alter apparent case trajectories and hence lead to distorted model projections.

Curated data repositories can also help provide modeling teams with easy access to granular data on the wide array of other phenomena that might affect the subsequent course of the epidemic. These include mobility statistics, genomic sequences, wastewater surveillance, government responses, and behavioral data.

CONCLUSIONS

During the pandemic, model and data curation evolved in real time. This is far from optimal; we do not learn how to forecast a cyclone while it is happening. The value proposition of the hub coordination model is two-fold. First, scientifically, there is value in building infrastructure with standing capability to evaluate which models, ensemble approaches, and data were most useful at different times during the outbreak response. Second, operationally, there is value in developing procedures that harness the insights of a diverse network of scientists while guarding against groupthink and overconfidence.12

As researchers, system developers, and public health officials who have been deeply involved in the real-time operation of modeling hubs during the COVID-19 pandemic and prior epidemics, we believe the hub approach is a vital path forward for predictive disease modeling efforts. Bringing together multiple modeling teams to answer pressing questions can provide partners with important information during emerging outbreaks. At their best, hubs provide the leadership and operational structure to ensure that model outputs are solicited widely, stored centrally, synthesized efficiently, communicated clearly, and evaluated honestly.

Modeling hubs and public data curation are and will remain crucial pieces of infrastructure for supporting public health decision-making in outbreak crises. It will be important to extend these approaches so they can be adopted in low- and middle-income countries to inform decisions in resource-constrained settings. Critical issues include building local capacity for modeling and strengthening global connections between modelers and policymakers.

In all, the systems developed before and matured during the COVID-19 pandemic are just a beginning. They must be nurtured and sustained between epidemics so they can help turn the tide the next time human populations face a pandemic.

ACKNOWLEDGMENTS

We acknowledge the hundreds of modelers who have contributed to the hubs, in many cases by setting aside other responsibilities to make time to develop new models. In addition, we thank the many team members of the hubs themselves, whose day-to-day efforts keep the hubs operating smoothly. N. G. Reich was supported by the National Institute of General Medical Sciences (R35GM119582) and the Centers for Disease Control and Prevention (CDC; U01 IP001122-01). S. Funk was supported by the Wellcome Trust (210758/Z/18/Z). A. Vespignani is supported by CDC-HHS-6U01IP001137-01 and cooperative agreement no. NU38OT000297 from the Council of State and Territorial Epidemiologists. R. J. Tibshirani is funded by a CDC Center of Excellence grant and a gift from Google.org. K. Shea acknowledges funding from National Science Foundation (NSF) awards DEB-1911962, DEB-1908538, and NSF COVID-19 RAPID awards DEB-2028301 and DEB-2126278. M. Schienle and J. Bracher acknowledge funding by the Helmholtz Foundation IPV-Project SIMCARD. R. Rosenfeld is funded by a CDC Center of Excellence grant. H. Hochheiser acknowledges support from the National Institute of General Medical Sciences (U24GM132013). R. K. Borchering acknowledges support from two NSF COVID-19 Rapid Response Research (RAPID) awards (principal investigator, K. Shea).

Note. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention or the National Institutes of Health. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US government.

CONFLICTS OF INTEREST

J. Lessler has served as an expert witness in cases where transmission of SARS-CoV-2 and the length of the pandemic were at issue. The remaining authors have no conflicts of interest to declare.

REFERENCES

  • 1. Poletto C, Scarpino SV, Volz EM. Applications of predictive modelling early in the COVID-19 epidemic. Lancet Digit Health. 2020;2(10):e498–e499. doi: 10.1016/S2589-7500(20)30196-5.
  • 2. Biggerstaff M, Cowling BJ, Cucunubá ZM, et al. WHO COVID-19 Modelling Parameters Group. Early insights from statistical and mathematical modeling of key epidemiologic parameters of COVID-19. Emerg Infect Dis. 2020;26(11):e1–e14. doi: 10.3201/eid2611.201074.
  • 3. Cramer EY, Ray EL, Lopez VK, et al.
  • 4. Borchering RK, Viboud C, Howerton E, et al. Modeling of future COVID-19 cases, hospitalizations, and deaths, by vaccination rates and nonpharmaceutical intervention scenarios — United States, April–September 2021. MMWR Morb Mortal Wkly Rep. 2021;70(19):719–724. doi: 10.15585/mmwr.mm7019e3.
  • 5. Bracher J, Wolffram D, Deuschel J, et al. A pre-registered short-term forecasting study of COVID-19 in Germany and Poland during the second wave. Nat Commun. 2021;12(1):5173. doi: 10.1038/s41467-021-25207-0.
  • 6. European Covid-19 Forecast Hub. 2022. Available at: https://covid19forecasthub.eu
  • 7. Gneiting T, Raftery AE. Weather forecasting with ensemble methods. Science. 2005;310(5746):248–249. doi: 10.1126/science.1115255.
  • 8. Johansson MA, Apfeldorf KM, Dobson S, et al. An open challenge to advance probabilistic forecasting for dengue epidemics. Proc Natl Acad Sci U S A. 2019;116(48):24268–24274. doi: 10.1073/pnas.1909865116.
  • 9. Pirracchio R, Petersen ML, Carone M, Rigon MR, Chevret S, van der Laan MJ. Mortality prediction in intensive care units with the Super ICU Learner Algorithm (SICULA): a population-based study. Lancet Respir Med. 2015;3(1):42–52. doi: 10.1016/S2213-2600(14)70239-5.
  • 10. McGowan CJ, Biggerstaff M, Johansson M, et al. Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016. Sci Rep. 2019;9(1):683. doi: 10.1038/s41598-018-36361-9.
  • 11. Viboud C, Sun K, Gaffey R, et al. The RAPIDD Ebola forecasting challenge: synthesis and lessons learnt. Epidemics. 2018;22:13–21. doi: 10.1016/j.epidem.2017.08.002.
  • 12. Shea K, Runge MC, Pannell D, et al. Harnessing multiple models for outbreak management. Science. 2020;368(6491):577–579. doi: 10.1126/science.abb9934.
  • 13. Truelove S, Smith CP, Qin M, et al. 2022.
  • 14. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533–534. doi: 10.1016/S1473-3099(20)30120-1.
  • 15. Reinhart A, Brooks L, Jahja M, et al. An open repository of real-time COVID-19 indicators. Proc Natl Acad Sci U S A. 2021;118(51):e2111452118. doi: 10.1073/pnas.2111452118.

Articles from American Journal of Public Health are provided here courtesy of American Public Health Association
