The Global Health Security Index, published in October 2019 by public health experts (1), rated nations by their readiness to respond to infectious disease outbreaks. The United States ranked first, whereas New Zealand trailed in 35th place. Yet when the coronavirus disease 2019 (COVID-19) pandemic struck months later, these rankings turned out to be unreliable (2): The United States has to date reported the most deaths, whereas New Zealand fares considerably better. These results point to a fundamental challenge more than to the failings of a particular assessment approach. They may demonstrate that, for all our careful testing of pharmaceuticals, researchers lack a solid evidence base regarding the effectiveness of health policies—against pandemics, and more widely.
To better understand the implications and effectiveness of COVID-19–related policy and public health measures—ranging from school openings to mask-wearing to travel restrictions—we need to do a much better job devising controlled experiments. Image credit: Shutterstock/Syda Productions.
Experts and others had strong opinions on how to respond as COVID-19 spread to all corners of the world, but these were educated guesses at best. Authorities at every level (countries, provinces, states, regions, districts, and towns) acted vigorously, but variably (3), on a wide range of policy questions (see Table 1). Some locked down briefly, some for long periods, and others not at all. Travel restrictions, school closures, and the allocation of scarce COVID-19 medical resources ranging from tests to intensive care to vaccines were each instituted differently from one jurisdiction to another with little consistent reason, and then relaxed or reversed, often as incoherently as they had been introduced. The world’s response to the pandemic was essentially a potpourri, the equivalent of exposing the world’s population to a massive, uncontrolled cluster trial without systematic and detailed outcome assessment.
Table 1.
A variety of health policies have been implemented across countries during the COVID-19 pandemic. Image credit: Based on ref. 14, Oxford Covid-19 Government Response Tracker.
In some ways, this situation was understandable for a new disease with unknown effects, and given the urgent need to have a locally oriented policy. But after two years of trial and error, how much have we learned?
Unfortunately, the uncontrolled nature of these pandemic experiences makes it hard to learn much by analyzing them, except for the broadest generalizations (4, 5). To illustrate: Denmark and Germany, two Western European countries with many affinities, had different policies on school closures. But it is hard to conclude from the data whether differences between Denmark and Germany in infection rates, for example, stemmed from contrasting school-closure policies or from other confounding differences between these two countries. Indeed, the differences may relate to the immediate medical, legal, and political pressures that often spurred policies that either kept schools open or closed in different European countries early in the pandemic (6).
This series of worldwide yet uncontrolled actions has missed, and continues to miss, opportunities to build the knowledge base that we lacked at the outset of the pandemic. To improve the situation, we should turn to randomized, controlled experiments now.
Health Policy Trials
Health policy trials (HPTs) are experiments to assess the consequences, both desirable and undesired, of innovations in health policy or to compare the relative merits and drawbacks of alternative health policies (7). They are typically large-scale and use the routine delivery of care and other policy interventions as their study sites. Crucially, HPTs routinely use randomization of exposures and prespecified, systematic, and detailed outcome assessment to enhance rigor—and learning.
HPTs are a relatively novel concept. Probably the most influential HPT to date was the 2013 Oregon Medicaid study. The state of Oregon offered Medicaid health insurance to a randomly selected set of individuals whose income had hitherto been too high to qualify for this assistance. Economists urged the state to compare this cohort with other individuals like them who did not rely on Medicaid. The resulting scientific study provided the best evidence to date that gaining access to Medicaid decreases depression, improves self-reported health, and decreases financial hardship, but has little impact on other measured health outcomes (8, 9). The Oregon Medicaid study spawned several HPTs (8), although others took place as well (9).
At the height of the COVID-19 pandemic, an HPT might have, for example, temporarily randomized districts to compare alternative approaches to school opening, contact tracing, vaccination delivery, nonpharmaceutical public health interventions, or the choice between mandatory versus voluntary isolation measures. It could measure a broad set of relevant outcomes in identical ways across districts with different COVID-19 intervention exposures, including not only COVID-19 transmission but also non–COVID-19 health use and outcomes, children’s educational achievements and cognitive and social development, and public sentiment and trust.
For example, as we write, the BA.1 Omicron subvariant has waned, later subvariants so far seem relatively mild, and some schools, school districts, and state governments that mandated masking earlier in the pandemic are struggling with the question of whether the time is right to permit school attendance without a mask. Some benefits of removing that burdensome mandate are palpable, yet there are concerns that these policies might spread the disease and its sequelae (with current or future variants) to fellow students, teachers, and families. There is also much that we do not know regarding the exact impact of removing the mandate on community spread, on children’s educational achievements, on willingness to attend school, on parental employment, on children’s social connections and mental health, and on much else. Imagine that there were two approaches, each deemed permissible at this point in time:
1. Mandate: Children are encouraged by a suite of publicity activities to wear masks and, additionally, are permitted to attend school only if masked.
2. Information: Children are encouraged to wear masks by that same suite of activities but not mandated to wear masks at school.
There remains genuine uncertainty regarding the effects of each of these approaches on important health and social outcomes. Instead of leaving the matter to outspoken groups demanding one approach or the other, there is much to be said for randomizing school districts to approach 1 versus approach 2 and systematically measuring prespecified outcomes in exactly identical ways. Short-term outcomes, such as patterns of community spread in schools randomized to one approach versus the other, could be analyzed months later and inform policy locally and in similar settings immediately. Long-term outcomes, such as patterns of parental employment and other economic and social consequences (10), could in many cases still be meaningfully analyzed and improve policy later, depending on the intensity and patterns of randomized exposures over time. The results of each analysis would be made public so that additional school authorities could learn from the experience.
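To make the randomization step concrete, here is a minimal sketch in Python of how school districts might be allocated to the two approaches. It is an illustration only, not a procedure drawn from any actual trial: the district names, the arm labels, the seed, and the function `randomize_clusters` are all hypothetical, and a real HPT would likely add stratification (for example, by district size or baseline infection rate) and independent oversight of the allocation.

```python
import random

def randomize_clusters(cluster_ids, arms=("mandate", "information"), seed=2022):
    """Assign each cluster to one arm, keeping arm sizes as equal as possible."""
    rng = random.Random(seed)        # fixed seed so the allocation can be audited and reproduced
    shuffled = sorted(cluster_ids)   # sort first so the input order cannot influence the result
    rng.shuffle(shuffled)
    # After shuffling, deal clusters to arms in round-robin fashion.
    return {cluster: arms[i % len(arms)] for i, cluster in enumerate(shuffled)}

# Hypothetical district identifiers, used only for illustration.
districts = [f"district_{n:02d}" for n in range(1, 11)]
allocation = randomize_clusters(districts)

for district in sorted(allocation):
    print(district, allocation[district])
```

Fixing and publishing the seed before the allocation is run is one simple way to let outside parties verify that no district was steered toward a preferred arm.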
The Price of Not Knowing
Had we enacted HPTs as a routine component of health policy, we would by now know much more about COVID-19. HPTs will require explaining the importance of experimentation to the wider public (4). We should start doing so now, to prepare for rapid knowledge generation in future public health emergencies and to optimize routine health policy in many areas (4, 10).
In many high-income countries, one possible application might involve the current trend to remove mask mandates. In those low- and middle-income countries that are yet to receive enough vaccines to cover their populations, vaccine rollout could be cluster-randomized between different ways to arrange vaccination points, for impact on vaccination rates. Other HPTs there could compare different pro-vaccination messages for their respective impacts on trust in vaccines and in health personnel.
In the United States, Brazil, and elsewhere, extreme politicization of COVID-19 response and suspicion of public health officials and of science itself are costing many lives (5, 11). The data and analysis from randomized HPTs could offer robust evidence and more neutral persuasion than the advice of partisan politicians and of public health officials erroneously dismissed as partisan, potentially quelling some opposition.
In practice, setting up an HPT would mean not allowing a particular lobby, circumstance, unfounded conviction, or a conviction founded only on indirect information to dictate local policy. Instead, central decision makers would randomize districts, postcodes, clinics, schools, vaccine distribution centers, and other clusters or, in some cases, individuals to, say, one of two plausible options. Short-term and long-term differences would be rigorously studied, with potential benefits for both learning and demonstrated learning. The exclusive trialing of options that would be permissible otherwise would greatly limit any downsides.
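One way such differences between randomized clusters could be studied is a permutation test on cluster-level outcomes, which asks how often a difference at least as large as the observed one would arise if the arm labels were reassigned at random. The sketch below is illustrative only: the outcome values are invented, the function `permutation_p_value` is hypothetical, and an actual HPT would follow a prespecified analysis plan that might call for a different method.

```python
import random
from statistics import mean

def permutation_p_value(outcomes_a, outcomes_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean cluster-level outcomes."""
    rng = random.Random(seed)
    observed = abs(mean(outcomes_a) - mean(outcomes_b))
    pooled = list(outcomes_a) + list(outcomes_b)
    n_a = len(outcomes_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign arm labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)

# Invented weekly infection rates per 1,000 pupils, one value per district.
mandate_rates = [4.1, 3.8, 5.0, 4.4, 3.9]
information_rates = [4.6, 5.2, 4.9, 5.5, 4.3]

print(permutation_p_value(mandate_rates, information_rates))
```

The test treats the district, not the individual child, as the unit of analysis, which matches the unit that was randomized.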
Ethical HPT Oversight
Currently, HPTs are typically approved by either social science institutional review boards and research ethics committees or medical ones. When some of the end points are medical (e.g., the impact of social policy on infection rates) or the HPTs concern operations of medical systems (e.g., the impact of vaccination delivery stations on economic outcomes), it is hard to avoid a medical oversight body. These traditional institutional arrangements are not always well suited to this purpose. A high priority of most medical research oversight bodies is clinical trials of unapproved medications, vaccines, and biologics. It is unclear whether, in the case of HPT oversight, these existing oversight bodies and related regulatory frameworks would be very helpful. The expertise needed for effective review of HPTs is not only biomedical but also economic, environmental, and social-scientific. When HPTs randomize populations between individually permissible policies, most are unlikely to expose these populations to higher risk than either of these permissible alternatives would. Personal informed consent is broadly seen as unrealistic in HPTs (12), which affect entire populations routinely using clinics, schools, and so forth. This renders infeasible either the level of information or the voluntariness of high-quality clinical trial informed consent (13), yet ensuring the quality of informed consent is what most medical oversight bodies see as their main role. A major reason for the highly demanding ethical standards in clinical trials has to do with past abuses that simply do not exist in the short history of HPTs (4), where onerous, unnecessary requirements could stifle inquiry. Ethical oversight mechanisms guiding HPTs may also have to be especially context-specific and considerate of the variable governance capacities of different countries, more so than is standard in international clinical trials.
A radical solution for identifying the best HPT oversight system would be to run an HPT of HPTs. This would involve randomizing jurisdictions to adopt different oversight regimens for the various HPTs and comparing their performance years later. Even if that solution is unrealistic, our default should not be handing clinical trial oversight institutions the oversight of HPTs. Governance of the latter could be the responsibility of elected officials and professional staff, just like other policies on health, or of new and more suitable oversight bodies, created especially for HPTs.
As COVID-19 enters its endemic phase, outstanding policy questions remain. For example, we need to understand the safest ways to restore life to normal in some countries, and we need to determine the most efficient ways to disburse vaccines. These questions are ripe for HPTs. Conducting HPTs now would help countries continue to grapple with the disease in one form or another—whatever their preparedness and performance “rankings” might be. This should also become a routine part of how we introduce big and small reforms to non-COVID healthcare delivery and to health policy outside pandemics. Implementing health policies that are not grounded in strong evidence wastes resources and could cause avoidable harm. Therefore, when rigorous learning via HPTs is feasible, it is usually unethical to subject populations to untested policies—to anything other than an HPT.
For more than two years of COVID-19, we have missed opportunities to build the knowledge base that was unavailable at the outset of the pandemic. By consciously creating a rigorous research program that stems from the varied, often haphazard policy developments of the last two years, we could have by now substantially expanded that knowledge base.
As we prepare ourselves for the endemic stage of COVID-19, for future public health emergencies, and for long-term routine health policy, we can and must do better. Randomized, controlled health policy trials should become routine. The constant stream of evidence that would result could reveal the surest paths to preserving public health and welfare.
Acknowledgments
We acknowledge funding for this study by the Wellcome Trust, Grant No. 208766/Z/17/Z. T.B. was supported by the Alexander von Humboldt Foundation through the Alexander von Humboldt Professor Award, funded by the Federal Ministry of Education and Research; the European Union; and by the National Institute of Child Health and Human Development of NIH (R01-HD084233), National Institute on Aging of NIH (P01-AG041710), National Institute of Allergy and Infectious Diseases of NIH (R01-AI124389 and R01-AI112339), as well as the Fogarty International Center of NIH (U2R TW012140-01). S.A.M. was supported by the Olympia Morata Program of Heidelberg University.
Research Ethics Statement.
This study neither received nor required ethics approval, as it does not involve human or animal participants.
Footnotes
The authors declare no competing interest.
Any opinions, findings, conclusions, or recommendations expressed in this work are those of the authors and have not been endorsed by the National Academy of Sciences.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2119887119/-/DCSupplemental.
Data Availability
All study data are included in the article and/or supporting information.
Change History
June 13, 2022: Reference 1 has been updated.
References
1. GHS Index Staff, Global Health Security Index: Building Collective Action and Accountability. https://www.ghsindex.org/wp-content/uploads/2021/11/2019-Global-Health-Security-Index.pdf. Accessed 31 January 2022.
2. Berman P., Upstream Factors in Governments’ Responses to the COVID-19 Pandemic: Learning from Jurisdictional Case Studies (GHP, Harvard T.H. Chan School of Public Health, Boston, MA, 2021).
3. Hale T., et al., A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker). Nat. Hum. Behav. 5, 529–538 (2021).
4. Bichay N., Randomized controlled trials of public policy. Michigan Policy Wonk Blog (8 June 2016). http://ippsr.msu.edu/public-policy/michigan-wonk-blog/randomized-controlled-trials-public-policy. Accessed 21 February 2022.
5. Haldane V., et al., From response to transformation: How countries can strengthen national pandemic preparedness and response systems. BMJ 375, e067507 (2021).
6. Lindblad S., et al., School lockdown? Comparative analyses of responses to the COVID-19 pandemic in European countries. Eur. Educ. Res. J. 20, 564–583 (2021).
7. Newhouse J. P., Normand S. T., Health policy trials. N. Engl. J. Med. 376, 2160–2167 (2017).
8. National Bureau of Economic Research, Oregon Health Insurance Experiment. https://www.nber.org/programs-projects/projects-and-centers/oregon-health-insurance-experiment. Accessed 3 April 2021.
9. Baicker K., et al.; Oregon Health Study Group, The Oregon experiment—Effects of Medicaid on clinical outcomes. N. Engl. J. Med. 368, 1713–1722 (2013).
10. Weijer C., Hemming K., Phillips Hey S., Fernandez Lynch H., Reopening schools safely in the face of COVID-19: Can cluster randomized trials help? Clin. Trials 18, 371–376 (2021).
11. Kenworthy N., Koon A. D., Mendenhall E., On symbols and scripts: The politics of the American COVID-19 response. Glob. Public Health 16, 1424–1438 (2021).
12. Berner-Rodoreda A., et al., Consent requirements for testing health policies: An intercontinental comparison of expert opinions. J. Empir. Res. Hum. Res. Ethics, 10.1177/15562646221076764 (2022).
13. Faden R. R., et al., An ethics framework for a learning health care system: A departure from traditional research ethics and clinical ethics. Hastings Cent. Rep. 43, S16–S27 (2013).
14. Blavatnik School of Government Staff, Codebook for the Oxford Covid-19 Government Response Tracker. https://github.com/OxCGRT/covid-policy-tracker/blob/e02db68ec6220f73180bbe43613cd503957e9754/documentation/codebook.md. Accessed 7 June 2021.


