Letter. IEEE Signal Processing Letters. 2021 Mar 24;28:683–687. doi: 10.1109/LSP.2021.3068072

Quickest Detection of COVID-19 Pandemic Onset

Paolo Braca 1, Domenico Gaglione 1, Stefano Marano 2, Leonardo Maria Millefiori 1, Peter Willett 3, Krishna R Pattipati 3
PMCID: PMC8216249  NIHMSID: NIHMS1695814  PMID: 34163125

Abstract

This letter develops an easily implementable version of Page's CUSUM quickest-detection test, designed to work in certain composite hypothesis scenarios with time-varying data statistics. The decision statistic can be cast in a recursive form and is particularly suited for on-line analysis. By back-testing our approach on publicly available COVID-19 data, we find reliable early warning of infection flare-ups, early enough that the tool may be of use to decision-makers on the timing of restrictive measures that may need to be taken in the future.

Keywords: COVID-19 pandemic, MAST, pandemic waves, quickest detection

I. Introduction

We develop a version of Page's CUSUM quickest-detection procedure [1]–[4], applicable to a family of composite-hypothesis changes. We refer to it as MAST, the mean-agnostic sequential test. Consider a set of independent Gaussian observations $\{x_k\}$ of constant known standard deviation $\sigma$ and unknown mean sequence $\{\mu_k\}$. At an unknown time, the mean switches from being less than some prescribed limit (but otherwise unknown) to larger than some prescribed limit (but otherwise unknown). The goal is to detect the change, if any, as soon as possible. This framework represents a convenient abstraction of many problems of practical interest. Here we discuss its application to the detection of COVID-19 pandemic waves.

The outbreak of the COVID-19 infection is certainly one of the most serious global crises of the last two decades. The response of the research community has also been extraordinary, and comprehensive reviews have recently appeared in the literature [5], [6]. To contain the “first wave” of the COVID-19 pandemic in the spring of 2020, strict lockdown measures were imposed in many countries, with huge societal and economic costs [7]–[12]. In the fall of 2020, a “second pandemic wave” seems to have grown in many regions of the world, and governments and authorities were again faced with the dilemma of if and when to impose social restrictions. In this work, after developing the MAST quickest detection procedure, we show how it can provide valuable support to make informed and rational decisions, with a focus on detecting the second and subsequent waves of the COVID-19 pandemic.

II. MAST: A Novel Quickest Detection Test

Along the same lines as the derivation of Page's test, see, e.g., [2, Sec. 2.2.3] or [13, Sec. 8.2], we consider the following decision problem involving two statistical hypotheses with independent data:

$$\mathcal{H}_0:\ x_k \sim \mathcal{N}(\mu_{0,k}, \sigma^2),\ k = 1, \dots, n; \qquad \mathcal{H}_1:\ \begin{cases} x_k \sim \mathcal{N}(\mu_{0,k}, \sigma^2), & k = 1, \dots, j-1, \\ x_k \sim \mathcal{N}(\mu_{1,k}, \sigma^2), & k = j, \dots, n. \end{cases} \tag{1}$$

In (1), $x_1, \dots, x_n$ are the data available to the decision maker, $j$ is an unknown deterministic change time and the standard deviation $\sigma$ is assumed known. Note in (1) that in the case $j = n+1$, the alternative hypothesis is equivalent to the null one, i.e., there is no change in regime. Unlike the classical assumption of Page's test, in our problem the expected values before and after the change are unavailable. Accordingly, we model $\{\mu_{0,k}\}$ and $\{\mu_{1,k}\}$ as unknown deterministic sequences and we assume that they satisfy the following constraints:

$$\mu_{0,k} \le \beta_\ell, \qquad \mu_{1,k} \ge \beta_u, \qquad \beta_\ell \le \beta_u. \tag{2}$$

Thus, model (1) contains $n+1$ unknown parameters: the index of change $j$ and the two sequences of expected values. In (2), if $x_k$ represents the day-over-day ratio of new positive cases in a region, the most natural choice is $\beta_\ell = \beta_u = 1$, but it is convenient to consider the general case having an implied hysteresis. For example, $\beta_u$ may be specified based on the tolerable time to reach hospital capacity, while $\beta_\ell$ may be based on the time citizens can endure restrictions before reopening the economy or on the tolerable level of positive cases.

One might also consider

$$\mu_{0,k} = \mu_0 \le \beta_\ell, \qquad \mu_{1,k} = \mu_1 \ge \beta_u, \qquad \text{for all } k, \tag{3}$$

in place of (2). In some sense, this might be more natural, since the mean levels before and after the change are still assumed unknown, but are merely constant. However, formulation (3) does not admit a recursive Page-like procedure, whereas the MAST that results from (2) does.

According to the Generalized Likelihood Ratio Test (GLRT) principle [14], [15], the decision statistic for problem (1) is

II.

where the equality follows by recognizing that each factor of the products involves a single value of $\mu_{0,k}$ or $\mu_{1,k}$ and making explicit the constraints in (2). The suprema over $\mu_{0,k}$ and $\mu_{1,k}$ appearing in the above expression can be computed in closed form, as follows:

II.

which means that the ML (maximum likelihood) estimates of the unknown parameters are, respectively,

II.
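As a worked sketch of this step, the constrained suprema and the resulting estimates can be written out explicitly. The notation used here ($x_k$ for a sample, $\beta_\ell \le \beta_u$ for the two boundaries, $\sigma$ for the standard deviation) is assumed for illustration and may not match the letter's own symbols. The closed forms rely only on the fact that a Gaussian likelihood is unimodal in its mean, so the constrained maximizer is the sample clipped to the feasible half-line.

```latex
\begin{align*}
\sup_{\mu \le \beta_\ell} \exp\!\left(-\frac{(x_k-\mu)^2}{2\sigma^2}\right)
  &= \exp\!\left(-\frac{(x_k-\hat\mu_{0,k})^2}{2\sigma^2}\right),
  \qquad \hat\mu_{0,k} = \min\{x_k,\ \beta_\ell\}, \\
\sup_{\mu \ge \beta_u} \exp\!\left(-\frac{(x_k-\mu)^2}{2\sigma^2}\right)
  &= \exp\!\left(-\frac{(x_k-\hat\mu_{1,k})^2}{2\sigma^2}\right),
  \qquad \hat\mu_{1,k} = \max\{x_k,\ \beta_u\}, \\
\log \frac{\exp\!\left(-(x_k-\hat\mu_{1,k})^2 / 2\sigma^2\right)}
          {\exp\!\left(-(x_k-\hat\mu_{0,k})^2 / 2\sigma^2\right)}
  &= \frac{(x_k-\hat\mu_{0,k})^2 - (x_k-\hat\mu_{1,k})^2}{2\sigma^2}.
\end{align*}
```

The last line is the contribution of a single sample to the logarithm of the generalized likelihood ratio under these assumptions.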

This yields the GLRT statistic in the form

II.

or, equivalently, taking the logarithm:

II.

where

II.

The passage from the controlled to the critical regime is declared at the smallest Inline graphic such that

II.

where the threshold level Inline graphic is selected to trade-off decision delay and risk, two quantities that will be defined in Section III.

The test in (10) will be referred to as MASTInline graphic with boundaries Inline graphic and Inline graphic. The subscript Inline graphic appended to Inline graphic denotes its dependence on the stream of data Inline graphic, and the subscript Inline graphic appended to Inline graphic denotes its dependence on Inline graphic. Finally, by introducing the non-linearity

II.

we have Inline graphic.

As a sanity check, let us assume that values of Inline graphic closer to Inline graphic are confused with Inline graphic and, likewise, values of Inline graphic closer to Inline graphic are confused with Inline graphic. Then, we see from (9) that the contribution to Inline graphic provided by the sample Inline graphic is Inline graphic, where the negative sign applies to the former case and the positive one to the latter. In the actual operation of Inline graphic, the contribution given by the sample Inline graphic is regulated by its distance to the boundaries, as shown in (11):

  • values $x_k \le \beta_\ell$ give a negative contribution proportional to the square of the distance of $x_k$ from the upper boundary $\beta_u$;

  • values $\beta_\ell < x_k < \beta_u$ give a linear contribution, whose sign depends on which boundary $x_k$ is closest to;

  • values $x_k \ge \beta_u$ give a positive contribution proportional to the square of the distance of $x_k$ from the lower boundary $\beta_\ell$.
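The three cases above can be sketched as a single per-sample function. This is a minimal illustration rather than the authors' code: the names `mast_increment`, `beta_l`, `beta_u` and `sigma` are assumptions, and the 1/(2σ²) scaling is the one obtained by taking the logarithm of the Gaussian generalized likelihood ratio with clipped mean estimates, so the letter's own normalization of the non-linearity may differ by a constant factor.

```python
def mast_increment(x: float, beta_l: float, beta_u: float, sigma: float) -> float:
    """Per-sample log-GLR contribution under bounded-mean constraints (assumed form)."""
    if beta_l > beta_u:
        raise ValueError("beta_l must not exceed beta_u")
    # Constrained ML estimates: the sample clipped to each feasible half-line.
    mu0_hat = min(x, beta_l)   # controlled regime: mean at or below beta_l
    mu1_hat = max(x, beta_u)   # critical regime: mean at or above beta_u
    # x <= beta_l       -> -(x - beta_u)^2 / (2 sigma^2)   (negative, quadratic)
    # beta_l < x < beta_u -> linear in x, sign set by the closer boundary
    # x >= beta_u       -> +(x - beta_l)^2 / (2 sigma^2)   (positive, quadratic)
    return ((x - mu0_hat) ** 2 - (x - mu1_hat) ** 2) / (2.0 * sigma ** 2)


if __name__ == "__main__":
    for sample in (0.5, 0.95, 1.0, 1.05, 1.5):
        print(sample, mast_increment(sample, beta_l=0.9, beta_u=1.1, sigma=0.1))
```

Writing the function through the clipped estimates keeps the three regimes in one expression, which makes the continuity of the contribution at the two boundaries easy to see.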

Using the non-linearity of (11) in (8), one gets

II.

where we have used Inline graphic.

The MASTInline graphic decision statistic (12) can be expressed in recursive form. To see this, let us define Inline graphic, with Inline graphic, Inline graphic. By using the notation Inline graphic, we see that (12) can be written as Inline graphic. Then,

II.

We have thus arrived at a recursive expression for the decision statistic: Inline graphic and, for Inline graphic, Inline graphic, if Inline graphic, and Inline graphic, otherwise. Equivalently: Inline graphic, and, for Inline graphic,

II.
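A recursion of this kind can be sketched in a few lines. The code below assumes the standard Page-type form (statistic initialized at zero, clipped at zero after each update, alarm at the first crossing of a threshold); the function name `mast_run`, its parameters and the 1/(2σ²) scaling of the increment are illustrative assumptions, not taken from the letter.

```python
from typing import Iterable, List, Optional, Tuple


def mast_run(
    samples: Iterable[float],
    gamma: float,
    beta_l: float = 1.0,
    beta_u: float = 1.0,
    sigma: float = 1.0,
) -> Tuple[List[float], Optional[int]]:
    """Return the statistic path and the 1-based index of the first alarm (or None)."""
    stat = 0.0
    path: List[float] = []
    alarm: Optional[int] = None
    for n, x in enumerate(samples, start=1):
        # Per-sample log-GLR increment with means clipped to the constraint sets.
        mu0_hat = min(x, beta_l)
        mu1_hat = max(x, beta_u)
        inc = ((x - mu0_hat) ** 2 - (x - mu1_hat) ** 2) / (2.0 * sigma ** 2)
        # Page-type update: accumulate, but never let the statistic go below zero.
        stat = max(0.0, stat + inc)
        path.append(stat)
        if alarm is None and stat > gamma:
            alarm = n
    return path, alarm


if __name__ == "__main__":
    data = [0.9, 0.9, 1.2, 1.3, 1.4]  # made-up growth ratios, for illustration only
    path, alarm = mast_run(data, gamma=10.0, sigma=0.1)
    print(path, alarm)
```

Only the previous value of the statistic is kept between samples, which is what makes the procedure suitable for on-line use.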

We now consider two special cases. First, let Inline graphic, a case referred to as the MASTInline graphic detector, with decision statistic Inline graphic and, for Inline graphic,

II.

Further assuming Inline graphic in (15), yields a decision procedure that we simply call MAST, whose decision statistic Inline graphic is denoted by Inline graphic: Inline graphic and, for Inline graphic,

II.

The second special case is when Inline graphic and Inline graphic, for some Inline graphic, which is relevant in connection to Page's test, as discussed next. As is well-known, if the mean values of the observed sequence before and after the change are constant and known, say Inline graphic and Inline graphic, the statistic to be compared to a suitable threshold level would be the CUSUM [1]–[3]: Inline graphic and, for Inline graphic,

II.

For Inline graphic, Eq. (11) gives Inline graphic, which shows that the decision statistic Inline graphic in (12) operates exactly as Page's test for samples Inline graphic.
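The link with Page's test can be checked numerically. The sketch below compares the Gaussian log-likelihood-ratio increment of a CUSUM with known means against the bounded-mean increment, under the assumption that the known means coincide with the two boundaries. Function names, the parameter `delta` and the numerical values are hypothetical, and the 1/(2σ²) scaling of the bounded-mean increment is an assumption.

```python
def cusum_increment(x: float, mu0: float, mu1: float, sigma: float) -> float:
    """Gaussian log-likelihood ratio of one sample for known means mu0 < mu1."""
    return (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma ** 2


def bounded_mean_increment(x: float, beta_l: float, beta_u: float, sigma: float) -> float:
    """Log-GLR increment when the means are only known to lie outside (beta_l, beta_u)."""
    mu0_hat = min(x, beta_l)
    mu1_hat = max(x, beta_u)
    return ((x - mu0_hat) ** 2 - (x - mu1_hat) ** 2) / (2.0 * sigma ** 2)


if __name__ == "__main__":
    delta, sigma = 0.2, 0.1            # hypothetical values
    lo, hi = 1.0 - delta, 1.0 + delta  # boundaries placed symmetrically around one
    for x in (0.85, 1.0, 1.15, 1.5):
        print(x, cusum_increment(x, lo, hi, sigma), bounded_mean_increment(x, lo, hi, sigma))
```

For samples between the boundaries the two increments agree; outside that interval the bounded-mean increment becomes quadratic while the CUSUM increment stays linear.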

Different optimality criteria have been advocated for the CUSUM test. The “first-order” criterion considers the asymptotic situation in which the mean time between false alarms goes to infinity and asserts that the CUSUM minimizes the worst-case mean delay, where the qualification “worst” refers to both the change time and the behavior of the process before change [2, p. 166]. The test based on (17) is in this sense the optimal quickest-detection Page's test.

It is worth noting that the MAST statistic in (16) is formally obtained by replacing the unknown value of Inline graphic appearing in the CUSUM statistic, with an estimate Inline graphic (constant factors can be incorporated in the threshold). This suggests an analogy between MAST for quickest-detection problems and the energy detector for testing the presence of an unknown time-varying deterministic signal buried in Gaussian noise, in the classical hypothesis testing framework [14].

III. Performance Assessment

The performance of MASTInline graphic is expressed in terms of mean delay time Inline graphic and the risk Inline graphic. The mean delay Inline graphic is the difference between the time at which the MASTInline graphic statistic Inline graphic crosses a preassigned threshold level Inline graphic, see (10), and the time of passage from the controlled to the critical regime. In the critical regime, the pandemic grows exponentially fast and it is therefore important to ensure that Inline graphic be as small as possible. This requirement is in contrast with the requirement Inline graphic. The risk Inline graphic is defined as the reciprocal of the mean time between two false alarms (see footnote 1). In turn, the mean time between false alarms is the mean time between two threshold crossings occurring in the controlled regime, assuming that the decision statistic is reset to zero at any threshold crossing event. Because of the unwelcome social and economic impact of the measures presumably taken by the authorities when passage into the critical regime is detected, it is evident that Inline graphic must be extremely small. The same performance indices Inline graphic and Inline graphic used to characterize MASTInline graphic are used for Page's test.

We now investigate the performance of MASTInline graphic by computer experiments, limiting the analysis to the case Inline graphic, i.e., the simple MAST. The performance of Page's test is used as a benchmark. Let us consider the following “scenario 0”. Fix Inline graphic. Suppose that the state of nature (mean value of the Inline graphic's) is Inline graphic for all Inline graphic in the controlled regime; likewise, suppose Inline graphic for all Inline graphic in the critical regime. By standard Monte Carlo counting, for MAST we found that the delay Inline graphic varies almost linearly with the threshold level Inline graphic, and that Inline graphic varies almost linearly with Inline graphic. The same approximate behavior is found, again by standard Monte Carlo counting, for the clairvoyant Page's test that is aware of the mean values Inline graphic and Inline graphic: the mappings Inline graphic and Inline graphic are approximately linear. These numerical analyses are not detailed for the sake of brevity. The observed behavior is known for Page's test, at least when the threshold Inline graphic is sufficiently large, in view of Wald's approximation, see, e.g., [2, Eq. 5.2.44]. In the present Gaussian case, more accurate formulas, known as Siegmund's approximations, are also available [2, Eqs. 5.2.64, 5.2.65].

We assume that the aforementioned linear mappings observed for MAST and Page's test hold true for any value of the threshold, and this assumption allows us to consider values of the mean delay and (especially) of the risk that would be difficult to obtain by standard Monte Carlo analysis. In this way, we obtain the operational curve of the two decision systems shown in Fig. 1. The operational curve is the relationship between Inline graphic and Inline graphic. As expected, Page's test outperforms MAST, because Page's test is optimal for the case addressed in scenario 0.
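The Monte Carlo counting described above can be sketched as follows for the simple MAST with both boundaries at one. This is an illustration under stated assumptions, not a reproduction of Fig. 1: the values of `sigma`, `delta` and the thresholds are hypothetical, the delay is measured with the change occurring at the first sample (statistic starting from zero), and the helper names `run_length` and `estimate` are invented for this example.

```python
import random
from statistics import mean


def run_length(mu: float, sigma: float, gamma: float, rng: random.Random,
               max_steps: int = 100_000) -> int:
    """Number of samples until the statistic first exceeds gamma (constant mean mu)."""
    stat = 0.0
    for n in range(1, max_steps + 1):
        x = rng.gauss(mu, sigma)
        # Simple MAST increment: sign(x - 1) * (x - 1)^2 / (2 sigma^2).
        stat = max(0.0, stat + (x - 1.0) * abs(x - 1.0) / (2.0 * sigma ** 2))
        if stat > gamma:
            return n
    return max_steps  # censored run: no crossing within max_steps


def estimate(mu: float, sigma: float, gamma: float, runs: int, seed: int) -> float:
    """Average run length over independent Monte Carlo runs."""
    rng = random.Random(seed)
    return mean(run_length(mu, sigma, gamma, rng) for _ in range(runs))


if __name__ == "__main__":
    sigma, delta = 0.1, 0.1  # hypothetical values
    for gamma in (1.0, 2.0, 3.0):
        delay = estimate(1.0 + delta, sigma, gamma, runs=200, seed=1)          # critical regime
        time_between_fa = estimate(1.0 - delta, sigma, gamma, runs=200, seed=2)  # controlled regime
        print(f"gamma={gamma}: mean delay={delay:.2f}, risk={1.0 / time_between_fa:.2e}")
```

Direct counting of this kind becomes slow for large thresholds, because false alarms are then rare; this is the regime in which the extrapolation of the fitted linear mappings is used in the text.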

Fig. 1.


Operational characteristic (risk Inline graphic versus decision delay Inline graphic) of the MAST quickest detection test, compared to the benchmark Page's test. Three scenarios are considered, as described in the main text. In scenario 0, Page's test is optimal. MAST outperforms Page's test in scenarios 1 and 2, in which the sequences Inline graphic and Inline graphic are time-varying. Scenario 2, in particular, mimics the actual behavior of the sequences, as observed in COVID-19 pandemic data, see Section IV.

The same numerical analysis has been conducted for “scenario 1” and “scenario 2,” also shown in Fig. 1. In scenario 1, we suppose that in the controlled regime, any Inline graphic is an instantiation of a uniform random variable with support Inline graphic, while in the critical regime any Inline graphic is an instantiation of a uniform random variable with support Inline graphic. In scenario 2, instead, we suppose that the sequences Inline graphic and Inline graphic are sinusoidal with a period of 75 days (see footnote 2). Specifically, in the controlled regime the sinusoid oscillates in Inline graphic, while in the critical regime it oscillates in Inline graphic. To implement Page's test in both scenarios 1 and 2, it is assumed that the mean values are constant, i.e., Inline graphic and Inline graphic, as in scenario 0. Clearly, no assumption about the mean values is instead needed for implementing MAST, except that they are bounded by one. In Fig. 1, we see that MAST outperforms Page's test, confirming its effectiveness when the mean values Inline graphic and Inline graphic are unknown, except for being bounded as shown in (2).
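The three kinds of mean sequences can be generated with a small helper such as the one below. It is only a sketch: the function name `scenario_means`, the interval endpoints passed as `lo` and `hi`, and the choice of the interval midpoint for the constant case are assumptions made for illustration, since the actual supports are not reproduced here. Only the 75-day period of the sinusoid comes from the text.

```python
import math
import random
from typing import List


def scenario_means(kind: str, n: int, lo: float, hi: float,
                   period: int = 75, seed: int = 0) -> List[float]:
    """Return n mean values inside [lo, hi] for one regime of a given scenario."""
    mid, amp = 0.5 * (lo + hi), 0.5 * (hi - lo)
    if kind == "constant":   # constant mean (placed at the midpoint, by assumption)
        return [mid] * n
    if kind == "uniform":    # independent uniform draws on [lo, hi]
        rng = random.Random(seed)
        return [rng.uniform(lo, hi) for _ in range(n)]
    if kind == "sinusoid":   # sinusoid spanning [lo, hi] with the given period in days
        return [mid + amp * math.sin(2.0 * math.pi * k / period) for k in range(n)]
    raise ValueError(f"unknown scenario kind: {kind!r}")


if __name__ == "__main__":
    controlled = scenario_means("sinusoid", 150, lo=0.8, hi=1.0)  # hypothetical range
    critical = scenario_means("sinusoid", 150, lo=1.0, hi=1.2)    # hypothetical range
    print(min(controlled), max(controlled), min(critical), max(critical))
```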

IV. Application to COVID-19 Pandemic Data

Starting from the landmark SIR model developed in [17], a multitude of sophisticated epidemiological models have been proposed to describe the pandemic evolution, based, e.g., on stochastic evolution of epidemic compartments [18]–[22] or on metapopulation networks [23], [24], just to cite two examples. The trend in the topical literature is to conceive increasingly complex models, often suitable for analysis by big-data techniques. The main goal of these models is to predict the mid/long-term evolution of the infection. Our focus, instead, is to quickly detect the onset of the exponential growth. With this aim, we consider an abbreviated observation model, built on the concept that the pandemic evolution is essentially a multiplicative phenomenon.

We model the number of new positive individuals on day $n$, say $p_n$, as the number $p_{n-1}$ of new positive individuals on day $n-1$, multiplied by a random variable $x_n$. Further including a “noise” term $w_n$ yields the scalar discrete-time state equation $p_n = x_n\, p_{n-1} + w_n$, $n \ge 1$, for some initial state $p_0$. Such a recursion, under various assumptions for the sequences $\{x_n\}$ and $\{w_n\}$, is known as a perpetuity and appears in many disciplines [25]–[27]. We assume that the noise term $w_n$ is negligible, yielding (see footnote 3):

$$p_n = x_n\, p_{n-1}, \qquad n \ge 1, \tag{18}$$

for some $p_0 > 0$. In this article, we refer to model (18), in which $x_1, x_2, \dots$ are independent random variables. This is akin to the popular random walk model, with the independence of the increments of the random walk replaced by the independence of the ratios $x_n = p_n / p_{n-1}$. Model (18) is derived from SIR-like models and validated on COVID-19 data in [16], where it is also shown that the $x_n$'s closely follow a Gaussian distribution with (unknown) time-varying expected value $\mu_n$, and a common standard deviation $\sigma$ (see footnote 4).

As long as $\mu_n < 1$, the sequence $p_n$ tends to decay exponentially to zero, while, for $\mu_n > 1$, $p_n$ tends to increase exponentially fast. We are interested in quickly detecting the passage from the former situation (a controlled regime) to the latter (critical). Detecting this change can be cast in terms of a binary decision problem between two hypotheses, referred to as the null and the alternative, as shown in (1).
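The multiplicative model and the resulting decay or growth can be illustrated with a short simulation. The sketch assumes Gaussian daily ratios with a given mean sequence and a common standard deviation, clipped at zero so that counts stay nonnegative. The function names, the initial count and all numerical values are hypothetical.

```python
import random
from typing import List, Sequence


def simulate_counts(means: Sequence[float], sigma: float, p0: float, seed: int = 0) -> List[float]:
    """Simulate daily counts where each day equals the previous day times a random ratio."""
    rng = random.Random(seed)
    counts = [p0]
    for mu in means:
        ratio = max(0.0, rng.gauss(mu, sigma))  # clip: a growth ratio cannot be negative
        counts.append(counts[-1] * ratio)
    return counts


def growth_ratios(counts: Sequence[float]) -> List[float]:
    """Recover the day-over-day ratios from a series of counts (zero days are skipped)."""
    return [b / a for a, b in zip(counts, counts[1:]) if a > 0]


if __name__ == "__main__":
    # 30 days with mean ratio below one, then 30 days with mean ratio above one.
    means = [0.95] * 30 + [1.05] * 30
    counts = simulate_counts(means, sigma=0.05, p0=1000.0, seed=1)
    print(round(counts[30], 1), round(counts[-1], 1))
    print(growth_ratios(counts)[:3])
```

The ratios returned by `growth_ratios` are the quantities a detector of this type would monitor, rather than the raw counts.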

An example of application of MAST to COVID-19 data is provided in Fig. 2. The abscissa point at which the MAST statistic crosses the threshold represents the day at which the onset is detected. The test threshold is state-dependent, as discussed in [16]. For clarity of illustration, only the smallest and largest thresholds corresponding to the risk Inline graphic are shown; for many states, the choice between them makes a difference of only a few days in the time of alert. One observation is that restrictive measures have not been adopted in as timely a manner as suggested by the MAST analysis. The reader is referred to [12], [16], [28]–[30] for details. Several aspects of the MAST analysis of COVID-19 data deserve further study. These include the pre-processing to clean the data from gross errors (e.g., asynchronous or unreported data); the generalization of the approach to analyze other publicly available time-series (e.g., number of hospitalized, number of deaths), possibly jointly as a vector of observations; and the on-line estimation of the variance to make the detector robust to the statistical fluctuations often observed in COVID-19 data.

Fig. 2.


MAST decision statistic computed for 10 US states and used to detect the onset of the COVID-19 second wave. The dashed horizontal lines represent the smallest and largest thresholds corresponding to Inline graphic, for the ensemble of the ten states. Curves are prolonged beyond threshold crossing for clarity.

V. Conclusion

This article derived a sequential test called MAST, which is used in [16] to detect passage from the controlled regime in which the COVID-19 pandemic is restrained, to the critical regime in which the infection spreads exponentially fast. MAST is a variation of the celebrated Page's test based on the CUSUM statistic, designed for cases in which the expected values of the data are bounded below a lower barrier Inline graphic in the controlled regime, and above an upper barrier Inline graphic in the critical one, but are otherwise unknown. We show that MAST admits a recursive form and, in the simplest case Inline graphic, is formally obtained from Page's test with nominal expected values Inline graphic, by replacing Inline graphic with an estimate thereof. The performance of MAST is investigated by computer experiments. If the expected values of the data are constant and known, the performance loss of MAST with respect to the optimal Page's test is moderate. In pandemic scenarios, where the expected values of the data are not known, MAST can clearly outperform a Page's test designed with nominal values of the unknowns.

Funding Statement

The work of Krishna R. Pattipati was supported in part by the U.S. Office of Naval Research, in part by the U.S. Naval Research Laboratory under Grants N00014-18-1-1238 and N00173-16-1-G905, and in part by the Space Technology Research Institutes from National Aeronautics and Space Administration's (NASA's) Space Technology Research Grants Program under Grant 80NSSC19K1076. The work of Peter Willett was supported by the AFOSR under contract FA9500-18-1-0463.

Footnotes

1

Note that in a quickest detection application the concept of a “false alarm” is different from that in a fixed-block test.

2

Scenario 2 is consistent with the sequences of mean values obtained by the COVID-19 epidemic data observed for different countries [16].

3

The same multiplicative structure shown in (18) applies, other than Inline graphic, to different time-series related to the pandemic evolution, e.g., the number of hospitalized individuals [16].

4

Since Inline graphic and Inline graphic, Inline graphic is negligible, for all Inline graphic. Thus, one can safely assume that Inline graphic is a sequence of independent nonnegative random variables.

Contributor Information

Paolo Braca, Email: paolo.braca@cmre.nato.int.

Domenico Gaglione, Email: domenico.gaglione@cmre.nato.int.

Stefano Marano, Email: marano@unisa.it.

Leonardo Maria Millefiori, Email: leonardo.millefiori@cmre.nato.int.

Peter Willett, Email: peter.willett@uconn.edu.

Krishna R. Pattipati, Email: krishna.pattipati@uconn.edu.

References

  • [1] Page E., “Continuous inspection schemes,” Biometrika, vol. 41, no. 1/2, pp. 100–115, Jun. 1954.
  • [2] Basseville M. and Nikiforov I. V., Detection of Abrupt Changes: Theory and Application. Englewood Cliffs, NJ, USA: Prentice-Hall, 1993.
  • [3] Poor H. V. and Hadjiliadis O., Quickest Detection. Cambridge, U.K.: Cambridge Univ. Press, 2009.
  • [4] Truong C., Oudre L., and Vayatis N., “Selective review of offline change point detection methods,” Signal Process., vol. 167, Feb. 2020, Art. no. 107299.
  • [5] Roberts M. et al., “Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans,” Nature Mach. Intell., vol. 3, no. 3, pp. 199–217, Mar. 2021.
  • [6] Hu S. et al., “Weakly supervised deep learning for COVID-19 infection detection and classification from CT images,” IEEE Access, vol. 8, pp. 118869–118883, 2020.
  • [7] Anderson R. M., Heesterbeek H., Klinkenberg D., and Hollingsworth T. D., “How will country-based mitigation measures influence the course of the COVID-19 epidemic?” Lancet, vol. 395, no. 10228, pp. 931–934, Mar. 2020.
  • [8] Hellewell J. et al., “Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts,” Lancet Glob. Health, vol. 8, no. 4, pp. e488–e496, Apr. 2020.
  • [9] Nicola M. et al., “The socio-economic implications of the coronavirus pandemic (COVID-19): A review,” Int. J. Surg., vol. 78, pp. 185–193, Jun. 2020.
  • [10] Sharif A., Aloui C., and Yarovaya L., “COVID-19 pandemic, oil prices, stock market, geopolitical risk and policy uncertainty nexus in the US economy: Fresh evidence from the wavelet-based approach,” Int. Rev. Financ. Anal., vol. 70, Jul. 2020, Art. no. 101496.
  • [11] Guan D. et al., “Global supply-chain effects of COVID-19 control measures,” Nature Hum. Behav., vol. 4, no. 6, pp. 577–587, Jun. 2020.
  • [12] Millefiori L. M. et al., “COVID-19 impact on global maritime mobility,” 2020, to be published. [Online]. Available: https://arxiv.org/abs/2009.06960
  • [13] Tartakovsky A., Nikiforov I., and Basseville M., Sequential Analysis: Hypothesis Testing and Changepoint Detection. Boca Raton, FL, USA: CRC Press, 2014.
  • [14] Kay S. M., Fundamentals of Statistical Signal Processing, Volume II: Detection Theory. Upper Saddle River, NJ, USA: Prentice-Hall PTR, 1998.
  • [15] Poor H. V., An Introduction to Signal Detection and Estimation. New York, NY, USA: Springer-Verlag, 1988.
  • [16] Braca P., Gaglione D., Marano S., Millefiori L. M., Willett P., and Pattipati K., “Decision support for the quickest detection of critical COVID-19 phases,” Sci. Rep., to be published. [Online]. Available: https://arxiv.org/abs/2011.11540
  • [17] Kermack W. O., McKendrick A. G., and Walker G. T., “A contribution to the mathematical theory of epidemics,” Proc. Roy. Soc. London, vol. 115, no. 772, pp. 700–721, Aug. 1927.
  • [18] Skvortsov A. and Ristic B., “Monitoring and prediction of an epidemic outbreak using syndromic observations,” Math. Biosci., vol. 240, no. 1, pp. 12–19, Nov. 2012.
  • [19] Hu Z., Cui Q., Han J., Wang X., Sha W. E., and Teng Z., “Evaluation and prediction of the COVID-19 variations at different input population and quarantine strategies, a case study in Guangdong province, China,” Int. J. Infect. Diseases, vol. 95, pp. 231–240, Jun. 2020.
  • [20] Maier B. F. and Brockmann D., “Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China,” Science, vol. 368, no. 6492, pp. 742–746, May 2020.
  • [21] Gaglione D. et al., “Adaptive Bayesian learning and forecasting of epidemic evolution - data analysis of the COVID-19 outbreak,” IEEE Access, vol. 8, pp. 175244–175264, 2020.
  • [22] Allen L. J., “A primer on stochastic epidemic models: Formulation, numerical simulation, and analysis,” Infect. Diseases Model., vol. 2, no. 2, pp. 128–142, May 2017.
  • [23] Li R. et al., “Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2),” Science, vol. 368, no. 6490, pp. 489–493, May 2020.
  • [24] Chinazzi M. et al., “The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak,” Science, vol. 368, no. 6489, pp. 395–400, Apr. 2020.
  • [25] Vervaat W., “On a stochastic difference equation and a representation of non-negative infinitely divisible random variables,” Adv. Appl. Prob., vol. 11, no. 4, pp. 750–783, Dec. 1979.
  • [26] Embrechts P. and Goldie C., “Perpetuities and random equations,” in Asymptotic Statistics, Mandl P. and Hušková M., Eds. Heidelberg, Germany: Springer-Verlag, 1994, pp. 75–86.
  • [27] Hitczenko P. and Wesolowski J., “Renorming divergent perpetuities,” Bernoulli, vol. 17, no. 3, pp. 880–894, Aug. 2011.
  • [28] Soldi G. et al., “Quickest detection and forecast of pandemic outbreaks: Analysis of COVID-19 waves,” IEEE Commun. Mag., to be published. [Online]. Available: https://arxiv.org/abs/2101.04620
  • [29] Marano S. and Sayed A. H., “Decision-making algorithms for learning and adaptation with application to COVID-19 data,” IEEE Signal Process. Lett., to be published. [Online]. Available: https://arxiv.org/abs/2012.07844
  • [30] Braca P., Gaglione D., Marano S., Millefiori L. M., Willett P., and Pattipati K., “MAST: COVID-19 pandemic onset test - multi-country analysis and visualization,” 2020. [Online]. Available: https://covid-mast.github.io

Articles from IEEE Signal Processing Letters are provided here courtesy of the Institute of Electrical and Electronics Engineers
