Abstract
This letter develops an easily implementable version of Page's CUSUM quickest-detection test, designed to work in certain composite-hypothesis scenarios with time-varying data statistics. The decision statistic can be cast in recursive form and is particularly suited to on-line analysis. Back-testing the approach on publicly available COVID-19 data yields reliable early warning of infection flare-ups, early enough that the tool may help decision-makers time any restrictive measures that may be needed in the future.
Keywords: COVID-19 pandemic, MAST, pandemic waves, quickest detection
I. Introduction
We develop a version of Page's CUSUM quickest-detection procedure [1]–[4], applicable to a family of composite-hypothesis changes. We refer to it as MAST, the mean-agnostic sequential test. Consider a set of independent Gaussian observations $x_1, x_2, \dots$ of constant known standard deviation $\sigma$ and unknown mean sequence $\mu_1, \mu_2, \dots$. At an unknown time, the mean switches from being less than some prescribed limit (but otherwise unknown) to being larger than some prescribed limit (but otherwise unknown). The goal is to detect the change, if any, as soon as possible. This framework represents a convenient abstraction of many problems of practical interest. Here we discuss its application to the detection of COVID-19 pandemic waves.
The outbreak of the COVID-19 infection is certainly one of the most serious global crises of the last two decades. The response of the research community has also been extraordinary, and comprehensive reviews have recently appeared in the literature [5], [6]. To contain the “first wave” of the COVID-19 pandemic in the spring of 2020, strict lockdown measures were imposed in many countries, with huge societal and economic costs [7]–[12]. In the fall of 2020, a “second pandemic wave” appeared to be growing in many regions of the world, and governments and authorities were again faced with the dilemma of if and when to impose social restrictions. In this work, after developing the MAST quickest-detection procedure, we show how it can support informed and rational decisions, with a focus on detecting the second and subsequent waves of the COVID-19 pandemic.
II. MAST: A Novel Quickest Detection Test
Along the same lines as the derivation of Page's test, see, e.g., [2, Sec. 2.2.3] or [13, Sec. 8.2], we consider the following decision problem involving two statistical hypotheses with independent data:

$$
\begin{aligned}
\mathcal{H}_0 &: \; x_k \sim \mathcal{N}(\mu_{0,k}, \sigma^2), \quad k = 1, \dots, n, \\
\mathcal{H}_1 &: \; x_k \sim \mathcal{N}(\mu_{0,k}, \sigma^2), \quad k = 1, \dots, j-1, \qquad x_k \sim \mathcal{N}(\mu_{1,k}, \sigma^2), \quad k = j, \dots, n.
\end{aligned}
\tag{1}
$$

In (1), $x_1, \dots, x_n$ are the data available to the decision maker, $j$ is an unknown deterministic change time and the standard deviation $\sigma$ is assumed known. Note in (1) that in the case $j = n+1$, the alternative hypothesis is equivalent to the null one, i.e., there is no change in regime. Different from the classical assumption of Page's test, in our problem the expected values before and after the change are unavailable. Accordingly, we model $\{\mu_{0,k}\}$ and $\{\mu_{1,k}\}$ as unknown deterministic sequences and we assume that they satisfy the following constraints:

$$
\mu_{0,k} \le \mu_\ell, \qquad \mu_{1,k} \ge \mu_u, \qquad \text{with } \mu_\ell \le \mu_u.
\tag{2}
$$

Thus, model (1) contains $n+1$ unknown parameters: the index of change $j$ and the two sequences of expected values. In (2), if $x_k$ represents the ratio of daily positive cases in a region, the most natural choice is $\mu_\ell = \mu_u = 1$, but it is convenient to consider the general case having an implied hysteresis. For example, $\mu_u$ may be specified based on the tolerable time to reach hospital capacity, while $\mu_\ell$ may be based on the time citizens can endure restrictions before reopening the economy or on the tolerable level of positive cases.

One might also consider

$$
\mu_{0,k} = \mu_0 \le \mu_\ell, \qquad \mu_{1,k} = \mu_1 \ge \mu_u, \qquad \text{for all } k,
\tag{3}
$$

in place of (2). In some sense, this might be more natural, since the mean levels before and after the change are still assumed unknown, but are constant. However, formulation (3) does not admit a recursive Page-like procedure, whereas the MAST that results from (2) does.
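According to the Generalized Likelihood Ratio Test (GLRT) principle [14], [15], the decision statistic for problem (1) is

$$
\max_{1 \le j \le n+1}
\frac{\displaystyle \sup_{\{\mu_{0,k}\}_{k<j},\, \{\mu_{1,k}\}_{k \ge j}} \; \prod_{k=1}^{j-1} e^{-\frac{(x_k-\mu_{0,k})^2}{2\sigma^2}} \prod_{k=j}^{n} e^{-\frac{(x_k-\mu_{1,k})^2}{2\sigma^2}}}
{\displaystyle \sup_{\{\mu_{0,k}\}_{k \le n}} \; \prod_{k=1}^{n} e^{-\frac{(x_k-\mu_{0,k})^2}{2\sigma^2}}}
= \max_{1 \le j \le n+1} \prod_{k=j}^{n}
\frac{\displaystyle \sup_{\mu_{1,k} \ge \mu_u} e^{-\frac{(x_k-\mu_{1,k})^2}{2\sigma^2}}}
{\displaystyle \sup_{\mu_{0,k} \le \mu_\ell} e^{-\frac{(x_k-\mu_{0,k})^2}{2\sigma^2}}},
\tag{4}
$$

with the convention that an empty product equals one, and where the equality follows by recognizing that each factor of the products involves a single value of $\mu_{0,k}$ or $\mu_{1,k}$ and making explicit the constraints in (2). The suprema over $\mu_{0,k}$ and $\mu_{1,k}$ appearing in the above expression can be computed in closed form, as follows:

$$
\sup_{\mu_{0,k} \le \mu_\ell} e^{-\frac{(x_k-\mu_{0,k})^2}{2\sigma^2}} = e^{-\frac{(x_k-\min\{x_k,\mu_\ell\})^2}{2\sigma^2}},
\qquad
\sup_{\mu_{1,k} \ge \mu_u} e^{-\frac{(x_k-\mu_{1,k})^2}{2\sigma^2}} = e^{-\frac{(x_k-\max\{x_k,\mu_u\})^2}{2\sigma^2}},
\tag{5}
$$

which means that the ML (maximum likelihood) estimates of the unknown parameters are, respectively,

$$
\hat{\mu}_{0,k} = \min\{x_k, \mu_\ell\}, \qquad \hat{\mu}_{1,k} = \max\{x_k, \mu_u\}.
\tag{6}
$$

This yields the GLRT statistic in the form

$$
\max_{1 \le j \le n+1} \prod_{k=j}^{n} \exp\left\{ \frac{(x_k-\hat{\mu}_{0,k})^2 - (x_k-\hat{\mu}_{1,k})^2}{2\sigma^2} \right\},
\tag{7}
$$

or, equivalently, taking the logarithm:

$$
T_n = \max_{1 \le j \le n+1} \sum_{k=j}^{n} \lambda_k,
\tag{8}
$$

where

$$
\lambda_k = \frac{(x_k-\hat{\mu}_{0,k})^2 - (x_k-\hat{\mu}_{1,k})^2}{2\sigma^2}.
\tag{9}
$$

The passage from the controlled to the critical regime is declared at the smallest $n$ such that

$$
T_n > \gamma,
\tag{10}
$$

where the threshold level $\gamma$ is selected to trade off decision delay and risk, two quantities that will be defined in Section III.

The test in (10) will be referred to as MAST$_{\ell u}$ with boundaries $\mu_\ell$ and $\mu_u$. The subscript $n$ appended to $T_n$ denotes its dependence on the stream of data $x_1, \dots, x_n$, and the subscript $\ell u$ appended to MAST denotes its dependence on $(\mu_\ell, \mu_u)$. Finally, by introducing the non-linearity

$$
g(x) =
\begin{cases}
-(x-\mu_u)^2, & x \le \mu_\ell, \\
(\mu_u-\mu_\ell)\,(2x-\mu_\ell-\mu_u), & \mu_\ell < x < \mu_u, \\
(x-\mu_\ell)^2, & x \ge \mu_u,
\end{cases}
\tag{11}
$$

we have $\lambda_k = g(x_k)/(2\sigma^2)$.

As a sanity check, let us assume that values of $x_k$ closer to $\mu_\ell$ are confused with $\mu_\ell$ and, likewise, values of $x_k$ closer to $\mu_u$ are confused with $\mu_u$. Then, we see from (9) that the contribution to $T_n$ provided by the sample $x_k$ is $\mp(\mu_u-\mu_\ell)^2/(2\sigma^2)$, where the negative sign applies to the former case and the positive one to the latter. In the actual operation of MAST$_{\ell u}$, the contribution given by the sample $x_k$ is regulated by its distance to the boundaries, as shown in (11):

- values $x_k \le \mu_\ell$ give a negative contribution proportional to the square of the distance of $x_k$ from the upper boundary $\mu_u$;
- values $\mu_\ell < x_k < \mu_u$ give a linear contribution, whose sign depends on which boundary $x_k$ is closest to;
- values $x_k \ge \mu_u$ give a positive contribution proportional to the square of the distance of $x_k$ from the lower boundary $\mu_\ell$.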
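The three cases in the list above can be sketched in Python as follows. This is only an illustration: the function name `mast_nonlinearity` and the argument names `mu_l` and `mu_u` (lower and upper boundary) are hypothetical, and the value is left unnormalized, on the assumption that a constant factor such as $1/(2\sigma^2)$ can be applied separately.

```python
def mast_nonlinearity(x: float, mu_l: float, mu_u: float) -> float:
    """Unnormalized per-sample contribution described by the three cases.

    mu_l, mu_u: lower and upper boundaries (illustrative names).
    Any constant scaling (e.g. 1 / (2 * sigma**2)) is assumed to be
    applied by the caller and is omitted here.
    """
    if mu_l > mu_u:
        raise ValueError("expected mu_l <= mu_u")
    if x <= mu_l:
        # Below the lower boundary: negative, squared distance from mu_u.
        return -((x - mu_u) ** 2)
    if x >= mu_u:
        # Above the upper boundary: positive, squared distance from mu_l.
        return (x - mu_l) ** 2
    # Between the boundaries: linear in x, zero at the midpoint.
    return (mu_u - mu_l) * (2.0 * x - mu_l - mu_u)


for x in (0.8, 0.95, 1.0, 1.05, 1.2):
    print(x, mast_nonlinearity(x, mu_l=0.9, mu_u=1.1))
```

With these definitions the three branches agree at the two boundaries, so the contribution is a continuous, non-decreasing function of the sample.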
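Using the non-linearity of (11) in (8), one gets

$$
T_n = \frac{1}{2\sigma^2} \max_{1 \le j \le n+1} \sum_{k=j}^{n} g(x_k),
\tag{12}
$$

where we have used $\lambda_k = g(x_k)/(2\sigma^2)$ and the convention that an empty sum equals zero.

The MAST$_{\ell u}$ decision statistic (12) can be expressed in recursive form. To see this, let us define $S_n = \frac{1}{2\sigma^2} \sum_{k=1}^{n} g(x_k)$, with $n \ge 1$, $S_0 = 0$. By using the notation $m_n = \min_{0 \le i \le n} S_i$, we see that (12) can be written as $T_n = S_n - m_n$. Then,

$$
T_n = S_n - \min\{m_{n-1}, S_n\} = \max\{S_n - m_{n-1},\, 0\} = \max\left\{T_{n-1} + \frac{g(x_n)}{2\sigma^2},\, 0\right\}.
\tag{13}
$$

We have thus arrived at a recursive expression for the decision statistic: $T_0 = 0$ and, for $n \ge 1$, $T_n = T_{n-1} + g(x_n)/(2\sigma^2)$, if $T_{n-1} + g(x_n)/(2\sigma^2) \ge 0$, and $T_n = 0$, otherwise. Equivalently: $T_0 = 0$, and, for $n \ge 1$,

$$
T_n = \max\left\{0,\; T_{n-1} + \frac{g(x_n)}{2\sigma^2}\right\}.
\tag{14}
$$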
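A minimal sketch of why a recursion of this kind reproduces a running maximum of partial sums. It assumes that the statistic at time $n$ is the maximum, over change times $j = 1, \dots, n+1$, of the sum of per-sample contributions from $j$ to $n$, with the empty sum ($j = n+1$, no change) equal to zero. Both function names are hypothetical, and the increments are arbitrary numbers standing in for the per-sample contributions.

```python
def running_max_brute_force(increments):
    """For each n, max over 1 <= j <= n+1 of sum(increments[j-1:n]).

    The choice j = n + 1 gives an empty sum (value 0), i.e. "no change yet".
    """
    stats = []
    for n in range(1, len(increments) + 1):
        candidates = [sum(increments[j:n]) for j in range(n + 1)]
        stats.append(max(candidates))
    return stats


def running_max_recursive(increments):
    """Same quantity via the clipped recursion t <- max(0, t + g)."""
    stats, t = [], 0.0
    for g in increments:
        t = max(0.0, t + g)
        stats.append(t)
    return stats


increments = [0.5, -1.0, 0.2, 0.3, -0.1]
print(running_max_brute_force(increments))
print(running_max_recursive(increments))
```

The brute-force version costs a number of operations that grows quadratically with the number of samples, whereas the recursion needs one update per sample, which is what makes the test suitable for on-line use.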
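We now consider two special cases. First, let $\mu_\ell = \mu_u = \mu$, a case referred to as the MAST$_\mu$ detector, with decision statistic $T_0 = 0$ and, for $n \ge 1$,

$$
T_n = \max\left\{0,\; T_{n-1} + \frac{(x_n-\mu)\,|x_n-\mu|}{2\sigma^2}\right\}.
\tag{15}
$$

Further assuming $\mu = 1$ in (15) yields a decision procedure that we simply call MAST, whose decision statistic is denoted by $T_n^{\mathrm{MAST}}$: $T_0^{\mathrm{MAST}} = 0$ and, for $n \ge 1$,

$$
T_n^{\mathrm{MAST}} = \max\left\{0,\; T_{n-1}^{\mathrm{MAST}} + \frac{(x_n-1)\,|x_n-1|}{2\sigma^2}\right\}.
\tag{16}
$$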
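As an illustration of how the simple MAST could be run on a stream of data, the sketch below assumes a clipped recursion whose per-sample increment is the signed squared deviation of the sample from one, scaled by $1/(2\sigma^2)$. The function name `mast_alarm_time`, the scaling, and the numerical values are assumptions made for the example, not taken from the experiments of this letter.

```python
def mast_alarm_time(samples, threshold, sigma):
    """Return the 1-based index of the first threshold crossing, or None.

    Assumed increment: (x - 1) * |x - 1| / (2 * sigma**2), i.e. negative
    for samples below one and positive for samples above one.
    """
    t = 0.0
    for n, x in enumerate(samples, start=1):
        t = max(0.0, t + (x - 1.0) * abs(x - 1.0) / (2.0 * sigma ** 2))
        if t > threshold:
            return n
    return None


# Hypothetical noise-free stream: five samples at 0.9, then five at 1.2.
ratios = [0.9] * 5 + [1.2] * 5
print(mast_alarm_time(ratios, threshold=5.0, sigma=0.1))
```

In this toy stream the statistic stays at zero during the first five samples, then grows by about two per sample, so the alarm is raised at the eighth sample, three samples after the change.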
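The second special case is when $\mu_\ell = 1-\delta$ and $\mu_u = 1+\delta$, for some $\delta > 0$, which is relevant in connection to Page's test, as discussed next. As is well known, if the mean values of the observed sequence before and after the change are constant and known, say $1-\delta$ and $1+\delta$, the statistic to be compared to a suitable threshold level would be the CUSUM [1]–[3]: $Q_0 = 0$ and, for $n \ge 1$,

$$
Q_n = \max\left\{0,\; Q_{n-1} + \frac{2\delta}{\sigma^2}\,(x_n-1)\right\}.
\tag{17}
$$

For $1-\delta < x < 1+\delta$, Eq. (11) gives $g(x) = 4\delta\,(x-1)$, so that $g(x)/(2\sigma^2) = 2\delta\,(x-1)/\sigma^2$, which shows that the decision statistic $T_n$ in (12) operates exactly as Page's test for samples $x_k \in (1-\delta, 1+\delta)$.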
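For comparison, Page's CUSUM for a Gaussian mean shift can be sketched as follows, under the assumption that the known means are symmetric about one, $1-\delta$ before the change and $1+\delta$ after it. In that case the per-sample log-likelihood ratio is $2\delta(x-1)/\sigma^2$. The function names and the numerical values are illustrative only.

```python
def cusum_increment(x, delta, sigma):
    """Log-likelihood ratio of N(1 + delta, sigma^2) against N(1 - delta, sigma^2)."""
    return 2.0 * delta * (x - 1.0) / sigma ** 2


def page_cusum(samples, delta, sigma):
    """CUSUM statistic after each sample, started from zero."""
    stats, q = [], 0.0
    for x in samples:
        q = max(0.0, q + cusum_increment(x, delta, sigma))
        stats.append(q)
    return stats


# Hypothetical noise-free stream and nominal offset delta = 0.1.
print(page_cusum([0.9, 0.9, 1.2, 1.2], delta=0.1, sigma=0.1))

# Substituting |x - 1| for delta gives a signed-square increment.
x = 1.2
print(cusum_increment(x, abs(x - 1.0), 0.1))
```

The last line shows what happens when the nominal offset is replaced by the observed deviation $|x-1|$: the increment becomes proportional to $(x-1)|x-1|$, so no offset has to be chosen in advance.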
Different optimality criteria have been advocated for the CUSUM test. The “first-order” criterion considers the asymptotic situation in which the mean time between false alarms goes to infinity and asserts that the CUSUM minimizes the worst-case mean delay, where the qualification “worst” refers to both the change time and the behavior of the process before change [2, p. 166]. The test based on (17) is in this sense the optimal quickest-detection Page's test.
It is worth noting that the MAST statistic in (16) is formally obtained by replacing the unknown value of $\delta$ appearing in the CUSUM statistic (17) with the estimate $\hat{\delta} = |x_n - 1|$ (constant factors can be incorporated in the threshold). This suggests an analogy between MAST for quickest-detection problems and the energy detector for testing the presence of an unknown time-varying deterministic signal buried in Gaussian noise, in the classical hypothesis testing framework [14].
III. Performance Assessment
The performance of MAST$_{\ell u}$ is expressed in terms of the mean delay time $\Delta$ and the risk $R$. The mean delay $\Delta$ is the expected difference between the time at which the MAST$_{\ell u}$ statistic $T_n$ crosses a preassigned threshold level $\gamma$, see (10), and the time of passage from the controlled to the critical regime. In the critical regime, the pandemic grows exponentially fast and it is therefore important to ensure that $\Delta$ be as small as possible. This requirement is in contrast with the requirement $R \ll 1$. The risk $R$ is defined as the reciprocal of the mean time between two false alarms.¹ In turn, the mean time between false alarms is the mean time between two threshold crossings occurring in the controlled regime, assuming that the decision statistic is reset to zero at any threshold crossing event. Because of the unwelcome social and economic impact of the measures presumably taken by the authorities when passage into the critical regime is detected, it is evident that $R$ must be extremely small. The same performance indices $\Delta$ and $R$ used to characterize MAST$_{\ell u}$ are used for Page's test.
We now investigate the performance of MAST$_{\ell u}$ by computer experiments, limiting the analysis to the case $\mu_\ell = \mu_u = 1$, i.e., the simple MAST. The performance of Page's test is used as a benchmark. Let us consider the following “scenario 0”. Fix
. Suppose that the state of nature (mean value of the
's) is
for all
in the controlled regime; likewise, suppose
for all
in the critical regime. By standard Monte Carlo counting, for MAST we found that the delay $\Delta$ varies almost linearly with the threshold level $\gamma$, and that $\log R$ varies almost linearly with $\gamma$. The same approximate behavior is found, again by standard Monte Carlo counting, for the clairvoyant Page's test that is aware of the two constant mean values: the mappings $\gamma \mapsto \Delta$ and $\gamma \mapsto \log R$ are approximately linear. These numerical analyses are not detailed for the sake of brevity. The observed behavior is known for Page's test, at least when the threshold $\gamma$ is sufficiently large, in view of Wald's approximation, see, e.g., [2, Eq. 5.2.44]. In the present Gaussian case, more accurate formulas, known as Siegmund's approximations, are also available [2, Eqs. 5.2.64, 5.2.65].
We assume that the aforementioned linear mappings observed for MAST and Page's test hold true for any value of the threshold, and this assumption allows us to consider values of the mean delay and (especially) of the risk that would be difficult to obtain by standard Monte Carlo analysis. In this way, we obtain the operational curve of the two decision systems shown in Fig. 1. The operational curve is the relationship between $R$ and $\Delta$. As expected, Page's test outperforms MAST, because Page's test is optimal for the case addressed in scenario 0.
Fig. 1.
Operational characteristic (risk $R$ versus decision delay $\Delta$) of the MAST quickest-detection test, compared to the benchmark Page's test. Three scenarios are considered, as described in the main text. In scenario 0, Page's test is optimal. MAST outperforms Page's test in scenarios 1 and 2, in which the sequences $\{\mu_{0,k}\}$ and $\{\mu_{1,k}\}$ are time-varying. Scenario 2, in particular, mimics the actual behavior of the sequences, as observed in COVID-19 pandemic data, see Section IV.
The same numerical analysis has been conducted for “scenario 1” and “scenario 2,” also shown in Fig. 1. In scenario 1, we suppose that in the controlled regime, any
is an instantiation of a uniform random variable with support
, while in the critical regime any
is an instantiation of a uniform random variable with support
. In scenario 2, instead, we suppose that the sequences
and
are sinusoidal with a period of 75 days.² Specifically, in the controlled regime the sinusoid oscillates in
, while in the critical regime it oscillates in
. To implement the Page's test in both scenarios 1 and 2, it is assumed that the mean values are constant, i.e.,
and
, as in scenario 0. Clearly, no assumption about the mean values is instead needed for implementing the MAST test, except that they are bounded by one. In Fig. 1, we see that MAST outperforms Page's test, confirming its effectiveness when the mean values
and
are unknown, except for being bounded as shown in (2).
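The Monte Carlo counting mentioned above can be sketched along the following lines for the simple MAST with constant means. Everything numerical here is hypothetical (the standard deviation, the two means, the thresholds, and the number of runs are placeholders, not the values used for Fig. 1), and two simplifications are assumed: the statistic starts from zero at the change time, and runs are truncated at `max_steps`, which biases long false-alarm times downward.

```python
import random


def first_alarm(mean, sigma, threshold, seed, max_steps=100_000):
    """Number of samples until the simple-MAST statistic exceeds threshold."""
    rng = random.Random(seed)
    t = 0.0
    for n in range(1, max_steps + 1):
        x = mean + sigma * rng.gauss(0.0, 1.0)  # Gaussian sample, constant mean
        t = max(0.0, t + (x - 1.0) * abs(x - 1.0) / (2.0 * sigma ** 2))
        if t > threshold:
            return n
    return max_steps  # truncated run


def mean_alarm_time(mean, sigma, threshold, runs, seed=0):
    """Average first-alarm time over independent runs (one seed per run)."""
    total = sum(first_alarm(mean, sigma, threshold, seed + r) for r in range(runs))
    return total / runs


sigma = 0.05  # hypothetical value
for threshold in (1.5, 3.0):
    delay = mean_alarm_time(1.05, sigma, threshold, runs=100)   # critical regime
    gap = mean_alarm_time(0.95, sigma, threshold, runs=30)      # controlled regime
    print(f"threshold={threshold}: delay={delay:.1f}, risk={1.0 / gap:.2e}")
```

Repeating this over a grid of thresholds gives the empirical delay and risk curves; very small risks need very long runs, which is why an extrapolation of the observed trend is attractive.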
IV. Application to COVID-19 Pandemic Data
Starting from the landmark SIR model developed in [17], a multitude of sophisticated epidemiological models have been proposed to describe the pandemic evolution, based, e.g., on stochastic evolution of epidemic compartments [18]–[22] or on metapopulation networks [23], [24], to cite just two examples. The trend in the topical literature is to conceive increasingly complex models, often suitable for analysis by big-data techniques. The main goal of these models is to predict the mid- and long-term evolution of the infection. Our focus, instead, is to quickly detect the onset of exponential growth. With this aim, we consider a simplified observation model, built on the concept that the pandemic evolution is essentially a multiplicative phenomenon.
We model the number of new positive individuals on day $n$, say $p_n$, as the number $p_{n-1}$ of new positive individuals on day $n-1$, multiplied by a random variable $x_n$. Further including a “noise” term $w_n$ yields the scalar discrete-time state equation $p_n = x_n\, p_{n-1} + w_n$, $n \ge 1$, for some initial state $p_0$. Such a recursion, under various assumptions for the sequences $\{x_n\}$ and $\{w_n\}$, is known as a perpetuity and appears in many disciplines [25]–[27]. We assume that the noise term $w_n$ is negligible, yielding:³

$$
p_n = x_n\, p_{n-1}, \qquad n \ge 1,
\tag{18}
$$

for some $p_0 > 0$. In this article, we refer to model (18), in which $\{x_n\}$ are independent random variables. This is akin to the popular random walk model, with the independence of the increments of the random walk replaced by the independence of the ratios $x_n = p_n / p_{n-1}$. Model (18) is derived from SIR-like models and validated on COVID-19 data in [16], where it is also shown that the $x_n$'s closely follow a Gaussian distribution with (unknown) time-varying expected value $\mu_n$, and a common standard deviation⁴ $\sigma$.
As long as $\mu_n < 1$, the sequence $\{p_n\}$ tends to decay exponentially to zero, while, for $\mu_n > 1$, $\{p_n\}$ tends to increase exponentially fast. We are interested in quickly detecting the passage from the former situation (a controlled regime) to the latter (critical). Detecting this change can be cast in terms of a binary decision problem between two hypotheses, referred to as the null and the alternative, as shown in (1).
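A minimal simulation of a noise-free multiplicative model of this kind may help fix ideas. It assumes that each day's count is the previous day's count times an independent Gaussian growth ratio; the function names, the means, the standard deviation, and the initial count are all hypothetical.

```python
import random


def simulate_positives(p0, means, sigma, seed=0):
    """Daily counts under p_n = x_n * p_{n-1}, with x_n ~ N(means[n-1], sigma^2)."""
    rng = random.Random(seed)
    counts = [p0]
    for mu in means:
        x = rng.gauss(mu, sigma)  # growth ratio for this day
        counts.append(x * counts[-1])
    return counts


def daily_ratios(counts):
    """Recover the growth ratios x_n = p_n / p_{n-1} from the counts."""
    return [today / yesterday for yesterday, today in zip(counts, counts[1:])]


# Hypothetical scenario: 30 days below one (decay), then 30 days above one (growth).
means = [0.95] * 30 + [1.05] * 30
counts = simulate_positives(1000.0, means, sigma=0.02)
ratios = daily_ratios(counts)
print(round(counts[30], 1), round(counts[-1], 1))
print([round(r, 3) for r in ratios[:3]])
```

The recovered ratios are the quantities a detector of the kind discussed here would monitor: they fluctuate around a mean below one while the counts decay and around a mean above one once the counts start growing.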
An example of application of MAST to COVID-19 data is provided in Fig. 2. The abscissa point at which the MAST statistic crosses the threshold represents the day at which the onset is detected. The test threshold is state-dependent, as discussed in [16]. Then, for clarity of illustration, only the smallest and largest thresholds corresponding to the risk
are shown, which for many states makes only a few days difference as to the time of alert. One observation is that restrictive measures have not been adopted in as timely a manner as suggested by the MAST analysis. The reader is referred to [12], [16], [28]–[30] for details. Several aspects of the MAST analysis of COVID-19 data deserve further study. These include the pre-processing to clean the data from gross errors (e.g., asynchronous or unreported data); generalization of the approach to analyze other publicly available time-series (e.g., number of hospitalized, number of deaths) and even as a vector of observations; on-line estimation of the variance to make the detector robust to statistical fluctuations, often observed in COVID-19 data.
Fig. 2.
MAST decision statistic computed for 10 US states and used to detect the onset of the COVID-19 second wave. The dashed horizontal lines represent the smallest and largest thresholds corresponding to
, for the ensemble of the ten states. Curves are prolonged beyond threshold crossing for clarity.
V. Conclusion
This article derived a sequential test called MAST, which is used in [16] to detect the passage from the controlled regime, in which the COVID-19 pandemic is restrained, to the critical regime, in which the infection spreads exponentially fast. MAST is a variation of the celebrated Page's test based on the CUSUM statistic, designed for cases in which the expected values of the data lie below a lower barrier $\mu_\ell$ in the controlled regime, and above an upper barrier $\mu_u$ in the critical one, but are otherwise unknown. We show that MAST admits a recursive form and, in the simplest case $\mu_\ell = \mu_u = 1$, is formally obtained from Page's test with nominal expected values $1 \pm \delta$ by replacing $\delta$ with an estimate thereof. The performance of MAST is investigated by computer experiments. If the expected values of the data are constant and known, the performance loss of MAST with respect to the optimal Page's test is moderate. In pandemic scenarios, where the expected values of the data are not known, MAST can clearly outperform a Page's test designed with nominal values of the unknowns.
Funding Statement
The work of Krishna R. Pattipati was supported in part by the U.S. Office of Naval Research, in part by the U.S. Naval Research Laboratory under Grants N00014-18-1-1238 and N00173-16-1-G905, and in part by the Space Technology Research Institutes from National Aeronautics and Space Administration's (NASA's) Space Technology Research Grants Program under Grant 80NSSC19K1076. The work of Peter Willett was supported by the AFOSR under contract FA9500-18-1-0463.
Footnotes
1. Note that in a quickest detection application the concept of a “false alarm” is different from that in a fixed-block test.
2. Scenario 2 is consistent with the sequences of mean values obtained from the COVID-19 epidemic data observed for different countries [16].
3. The multiplicative structure shown in (18) applies not only to $p_n$ but also to other time-series related to the pandemic evolution, e.g., the number of hospitalized individuals [16].
4. Since $\mu_n \approx 1$ and $\sigma \ll 1$, $\mathbb{P}(x_n < 0)$ is negligible, for all $n$. Thus, one can safely assume that $\{x_n\}$ is a sequence of independent nonnegative random variables.
Contributor Information
Paolo Braca, Email: paolo.braca@cmre.nato.int.
Domenico Gaglione, Email: domenico.gaglione@cmre.nato.int.
Stefano Marano, Email: marano@unisa.it.
Leonardo Maria Millefiori, Email: leonardo.millefiori@cmre.nato.int.
Peter Willett, Email: peter.willett@uconn.edu.
Krishna R. Pattipati, Email: krishna.pattipati@uconn.edu.
References
- [1] Page E., “Continuous inspection schemes,” Biometrika, vol. 41, no. 1/2, pp. 100–115, Jun. 1954.
- [2] Basseville M. and Nikiforov I. V., Detection of Abrupt Changes: Theory and Application. Englewood Cliffs, NJ, USA: Prentice-Hall, 1993.
- [3] Poor H. V. and Hadjiliadis O., Quickest Detection. Cambridge, U.K.: Cambridge Univ. Press, 2009.
- [4] Truong C., Oudre L., and Vayatis N., “Selective review of offline change point detection methods,” Signal Process., vol. 167, Feb. 2020, Art. no. 107299.
- [5] Roberts M. et al., “Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans,” Nature Mach. Intell., vol. 3, no. 3, pp. 199–217, Mar. 2021.
- [6] Hu S. et al., “Weakly supervised deep learning for COVID-19 infection detection and classification from CT images,” IEEE Access, vol. 8, pp. 118869–118883, 2020.
- [7] Anderson R. M., Heesterbeek H., Klinkenberg D., and Hollingsworth T. D., “How will country-based mitigation measures influence the course of the COVID-19 epidemic?” Lancet, vol. 395, no. 10228, pp. 931–934, Mar. 2020.
- [8] Hellewell J. et al., “Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts,” Lancet Glob. Health, vol. 8, no. 4, pp. e488–e496, Apr. 2020.
- [9] Nicola M. et al., “The socio-economic implications of the coronavirus pandemic (COVID-19): A review,” Int. J. Surg., vol. 78, pp. 185–193, Jun. 2020.
- [10] Sharif A., Aloui C., and Yarovaya L., “COVID-19 pandemic, oil prices, stock market, geopolitical risk and policy uncertainty nexus in the US economy: Fresh evidence from the wavelet-based approach,” Int. Rev. Financ. Anal., vol. 70, Jul. 2020, Art. no. 101496.
- [11] Guan D. et al., “Global supply-chain effects of COVID-19 control measures,” Nature Hum. Behav., vol. 4, no. 6, pp. 577–587, Jun. 2020.
- [12] Millefiori L. M. et al., “COVID-19 impact on global maritime mobility,” 2020, to be published. [Online]. Available: https://arxiv.org/abs/2009.06960
- [13] Tartakovsky A., Nikiforov I., and Basseville M., Sequential Analysis: Hypothesis Testing and Changepoint Detection. Boca Raton, FL, USA: CRC Press, 2014.
- [14] Kay S. M., Fundamentals of Statistical Signal Processing, Volume II: Detection Theory. Upper Saddle River, NJ, USA: Prentice-Hall PTR, 1998.
- [15] Poor H. V., An Introduction to Signal Detection and Estimation. New York, NY, USA: Springer-Verlag, 1988.
- [16] Braca P., Gaglione D., Marano S., Millefiori L. M., Willett P., and Pattipati K., “Decision support for the quickest detection of critical COVID-19 phases,” Sci. Rep., to be published. [Online]. Available: https://arxiv.org/abs/2011.11540
- [17] Kermack W. O., McKendrick A. G., and Walker G. T., “A contribution to the mathematical theory of epidemics,” Proc. Roy. Soc. London, vol. 115, no. 772, pp. 700–721, Aug. 1927.
- [18] Skvortsov A. and Ristic B., “Monitoring and prediction of an epidemic outbreak using syndromic observations,” Math. Biosci., vol. 240, no. 1, pp. 12–19, Nov. 2012.
- [19] Hu Z., Cui Q., Han J., Wang X., Sha W. E., and Teng Z., “Evaluation and prediction of the COVID-19 variations at different input population and quarantine strategies, a case study in Guangdong province, China,” Int. J. Infect. Diseases, vol. 95, pp. 231–240, Jun. 2020.
- [20] Maier B. F. and Brockmann D., “Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China,” Science, vol. 368, no. 6492, pp. 742–746, May 2020.
- [21] Gaglione D. et al., “Adaptive Bayesian learning and forecasting of epidemic evolution - data analysis of the COVID-19 outbreak,” IEEE Access, vol. 8, pp. 175244–175264, 2020.
- [22] Allen L. J., “A primer on stochastic epidemic models: Formulation, numerical simulation, and analysis,” Infect. Diseases Model., vol. 2, no. 2, pp. 128–142, May 2017.
- [23] Li R. et al., “Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2),” Science, vol. 368, no. 6490, pp. 489–493, May 2020.
- [24] Chinazzi M. et al., “The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak,” Science, vol. 368, no. 6489, pp. 395–400, Apr. 2020.
- [25] Vervaat W., “On a stochastic difference equation and a representation of non-negative infinitely divisible random variables,” Adv. Appl. Prob., vol. 11, no. 4, pp. 750–783, Dec. 1979.
- [26] Embrechts P. and Goldie C., “Perpetuities and random equations,” in Asymptotic Statistics, Mandl P. and Hušková M., Eds. Heidelberg, Germany: Springer-Verlag, 1994, pp. 75–86.
- [27] Hitczenko P. and Wesolowski J., “Renorming divergent perpetuities,” Bernoulli, vol. 17, no. 3, pp. 880–894, Aug. 2011.
- [28] Soldi G. et al., “Quickest detection and forecast of pandemic outbreaks: Analysis of COVID-19 waves,” IEEE Commun. Mag., to be published. [Online]. Available: https://arxiv.org/abs/2101.04620
- [29] Marano S. and Sayed A. H., “Decision-making algorithms for learning and adaptation with application to COVID-19 data,” IEEE Signal Process. Lett., to be published. [Online]. Available: https://arxiv.org/abs/2012.07844
- [30] Braca P., Gaglione D., Marano S., Millefiori L. M., Willett P., and Pattipati K., “MAST: COVID-19 pandemic onset test - multi-country analysis and visualization,” 2020. [Online]. Available: https://covid-mast.github.io