Highlights
• A new way of regrouping epidemic reported data.
• Construction of wavelets for the unadjusted and adjusted reported data.
• Practically implementable method to construct true epidemic in an emerging epidemic.
Keywords: Partial to complete data, Convergence of graphs, Wavelets
Abstract
In this paper, we propose a two-phase procedure (combining discrete graphs and wavelets) for constructing true epidemic growth. In the first phase, a graph-theory-based approach is developed to update the partial data available, and in the second phase, we use this partial data to generate plausible complete data through wavelets. We provide two numerical examples. The procedure is novel, implementable, and adaptable to a machine learning modeling framework.
1. Basis, motivation and introduction
In general, it is not easy to build a true epidemic growth curve in a timely fashion for any newly emerging epidemic, and the chances of building a true growth picture worsen with poor disease reporting. Preparedness for epidemic spread is a primary public health concern for any health department. Epidemic reporting of a disease is fundamental to understanding two key quantities in epidemiology, namely, epidemic diffusion within a population and growth at a population level. For a real-time epidemic, the reporting of cases is rarely complete, especially if the epidemic is new or its symptoms are unknown or yet to be discovered. For viruses with short incubation periods and no virus shedding during incubation, any delay in reporting or lack of reporting could lead to a severe epidemic due to the absence of control measures. For example, for the Ebola virus, the incubation period ranges from 2 to 21 days, and an individual infected with the Ebola virus does not spread the virus to others during this period. If some of these individuals are not diagnosed (and hence not reported to health care facilities), then after the incubation period they will (unknowingly) spread the virus to others. Other viruses, for example influenza, have a short incubation period but are contagious during this period. Even for epidemics with established symptoms, reporting could be nowhere close to complete. The impact of reporting on epidemic surveillance can then be theoretically measured (Rao, 2012).
Wavelet theory was developed fairly recently (about 35–40 years ago), but applications of wave functions fitted to epidemic data were seen more than 60 years ago (for example, Soper, 1929; Bartlett, 1956, 1957). The observations by Bartlett (1957) on population size and the fraction of people infected with measles at various time points provided a new basis for using stochastic models to understand retrospectively the seasonality of epidemics. This study by Bartlett (1957) and other studies (for example, see the summary in the book by Anderson and May, 1991) consider wave-like fitting of time series data of epidemics. Bartlett (1956, 1957) considered two differential equations to investigate small oscillations around the equilibrium. Earlier than this, a formulation of such models was given by Soper (1929).
Suppose i(t) and s(t) represent the numbers of infected and susceptible individuals at time t. A simple ODE system considered by Bartlett (a modification of the model by Soper (1929)) to explain the dynamics of i(t) and s(t) was as follows:
(1.1) |
Here μ (infection related death rate), ν (immunization coefficient = m/n), λ (infectivity coefficient = n/μ) and are all positive constants. Bartlett considered (for u ≥ 0) and ( ≥ 0) around the equilibrium values of i(t) and s(t) in the model (1.1) to obtain a revised model as follows:
(1.2) |
The solutions of the model (1.2) after ignoring higher order term uv are,
where and Using the solutions u and v above, Bartlett created an artificial epidemic curve to demonstrate oscillatory behavior of infecteds.
Some studies focused on understanding phase transitions and analysis of angles generated at each time step within an epidemic wave (for example, Anderson, Grenfell, May, 1984, Earn, Rohani, Grenfell, 1998, Grenfell, Bolker, Kleczkowski, 1995). Beyond fitting wave functions to time series data on epidemics, researchers also focused on constructing specific wavelets as part of epidemiological modeling, for example, Morlet's wave functions (Grenfell, Bjørnstad, Kappey, 2001, Bauch, Earn, 2003, Bauch, Earn, 2003, Bauch, 2008, Cazelles, Cazelles, Chavez, 2014). All these epidemiology studies that adapted a wave function or wave series have primarily focused on fitting a curve to epidemic data, and some of them conducted mathematical analysis of their models. Such studies provide a great amount of information on the utility of wave functions in epidemiology, as well as on other mathematical and qualitative properties of epidemic waves.
In this study, we investigate a classical problem in epidemic reporting within a novel framework using harmonic analysis principles. See the flow chart in Fig. 1.1 for an overview of the proposed framework. This study develops methodologies for constructing complete data from partial data using wavelets (by partial data here we mean incomplete data that typically arises in an emerging epidemic, for example, the novel coronavirus of 2019 or COVID-19). Our approach will be useful in deep learning of very large data structures and in creating complete data from partial data. During most emerging epidemics, only partial data can be handled, because the corresponding full data are seldom available. The proposed ideas, hence, could be helpful in artificial intelligence (AI) based algorithms for disease surveillance.
2. Fundamental questions
We raise here very fundamental questions in epidemic reporting. For example, do the epidemic cases reported over time follow any pattern? Or, in any particular situation, does an epidemic reporting pattern have anything to do with the actual epidemic wave? The actual or true epidemic wave is the number of all cases (reported and not reported) as a function of time, i.e. it can be seen as the function obtained by plotting the number of infected people against discrete time points. Is there any strong or weak association between “epidemic reporting patterns” and “epidemic waves” in general? If we cannot generalize such an association to every epidemic, will there be any such association for a particular epidemic?
In any epidemic, we can hardly observe the actual (true) epidemic wave, and what we construct as a wave is mostly based on reporting numbers of disease cases. The central questions in which we are interested can be summarized as follows:
(i) How far does the reporting of an epidemic help us in accurately predicting the epidemic, especially if it is an emerging epidemic? How far can such an association be clarified (where it otherwise is not) using methods of harmonic analysis?
(ii) It is seldom that the cases generated in a population are completely detected, so the question that remains unanswered most of the time in a newly emerging epidemic is: Will there be any way to back-calculate and reproduce these numbers lost in detection and, if so, to what extent can we accurately reconstruct an epidemic growth (before control measures are implemented)?
There are other related questions but first we want to view matters from the lens of wavelet/harmonic analysis because we believe there could be some useful light to be shed in this way.
It is always challenging to construct true epidemic waves, yet population vaccination and control policies depend on understanding the true ground level reality of disease cases. Before we treat wavelets in this new context, we define the Fourier series and Fourier transform of a function.
3. Wavelets
Over the past thirty-five years, a new branch of harmonic analysis called wavelet theory has developed; see Meyer and Ryan (1993), Meyer (1998), Hernández and Weiss (1996), Walker (1997), Strichartz (1993), Krantz, 1999, Krantz, Krantz and Labate et al. (2013). Largely based on the ideas of Yves Meyer, wavelet theory replaces the traditional Fourier basis of sine and cosine functions with a more flexible and adaptable basis of wavelets. The advantages of wavelets are these:
(a) The wavelet expansion of a function can be localized both in the time and the space variables;
(b) The wavelet expansion can be customized to particular applications;
(c) Because of (b), the wavelet expansion is more accurate than the traditional Fourier expansion and also more rapidly converging.
Wavelet theory has revolutionized the study of image compression, the study of signal processing, and the study of partial differential equations. It will be a powerful new tool in the study of epidemiology, particularly in the analysis of epidemic growth curves.
4. Theoretical strategy
This study will develop methodologies of constructing complete data from partial data using wavelets.
First, we propose to build a true wave of an epidemic (through harmonic analysis set-up and assumptions) that is otherwise unknown directly. Then, by assuming that a fraction of this constructed wave was reported out of a true wave, we will determine how an observed wave appears. These fractions are variables, so we will have several patterns of waves representing one true epidemic wave. We will draw conclusions as to which one of these representations is an ideal candidate for building a true epidemic. There will be some noise in our modeling of the epidemic curve so we will use noise reduction techniques before finalizing a pattern.
Suppose an epidemic wave was observed within a time interval [t 0, tn], where could be in weeks, months, or years. Suppose that [t 0, tn] is partitioned into a set S of sub-intervals where could be in days or weeks depending upon the situation. Let ai and bi be the number of cases reported and number of cases that occurred but were not reported, respectively, within the interval for Let f be the function whose domain is the set of time intervals and whose range is the set (see Fig. 4.1 ). Here f need not be a one-to-one function because two time intervals within S could have the same number of epidemic cases. Let f 1 be the function defined as where (see Fig. 4.2 ). We call f 1 a fractional function of f because it maps each time interval into a corresponding number of reported cases in each of those intervals. The total fraction of reported cases during [t 0, tn] is distributed into n time intervals. Note that
Given that f 1 is known, the question will be whether we can estimate (or speculate about) f. Once we are able to estimate some form of f, then how can we test for accuracy of the form(s) obtained? We could define another fraction function as
where
and could attempt to develop techniques to estimate (or speculate about) f from f 2. In Fig. 4.2, the fractional epidemic wave pattern is not fully describing a true epidemic wave pattern. Purely from a fractional epidemic wave, it is not easy to speculate a true epidemic wave pattern. Additional information on bi values is needed for better prospects in speculation of f.
This work is innovative because, to our knowledge, wavelets have not previously been used to build on partial epidemic information. The wavelets thus built will be retrospectively updated. This offers a new technique to improve reported data and to reconstruct epidemic dynamics. We anticipate that these methods will also be applicable to several other problems in medical science.
4.1. Generating wavelets from sampled epidemic data:
Let us consider the availability of data as per Fig. 4.1 on reported cases. Suppose each point on the y-axis is considered as a sampled point (out of many sets of plausible reported cases at that point). We call it a sample point because we are not sure that the point we obtain at any time intervals [t 0, t 1] and for for the reported cases represents a true epidemic curve, which is one of the main assumptions in this paper. The total number of reported cases within a time interval is a combination of cases that were reported to the public health system (see Fig. 4.4). When we know the total number of reported cases within a time interval, then this number could be the result of one of the several combinations of cases reported as shown in Fig. 4.4.
The true number of cases occurring in a population is often unknown, and what we usually know is the number of reported cases. Let ai be the number of reported cases in the ith time interval, which is to say, the result of one of the several possible combinations of actual cases in the population.
Let Ti be the number of total cases in ith time interval, and let it be
where si is the sth individual infected in the interval i. Then ai is the number of cases formed by one combination of elements of Ti. There could be several such combinations which lead to the same value ai. Hence ai can be treated as one sample combination out of many possible combinations. The set of cases which led to ai we call here the support of ai, and we explain it further in the following paragraphs.
Within each interval, the combination of cases reported is unknown but the total reported cases out of actual disease cases within each interval is fixed (because we will take a single point reference of total reported cases within each interval). These reported cases are ais in Fig. 4.1. Given that there exists a sampled point within each of the time intervals, and with some support for each of the ai (say, supp(ai)), we will construct a wavelet for each time interval [t 0, t 1] and for . When we say a sample point at a time interval we mean the final combination of cases reported (out of actual cases) that were considered as a final reporting number for that interval. We will use data that was used to get the sample point ai.
In Fig. 4.4, for the interval [t 0, t 1], the arcs connecting each of the black circles (vertices) within each square form a graph. Although the sizes of each of these graphs are the same, i.e. 7, their shapes are different and the sampled point is 7. The sampled point cannot be easily used to represent the shape of the graph unless the location of each node (in this case a physical address or geographical location of each node) is known. One way to construct the support could be from the graph associated with each sample point. Using pairs of information {ai, supp(ai)}, we will construct wavelets as shown in Fig. 4.3 . Sampled points within each interval [t 0, t 1] and for gives the number of reported cases and supports constructed from the graphs within each of these intervals. Within each square or rectangle in Fig. 4.4, if the size of a graph increases to the maximum possible size (i.e., when all cases are reported), then the information to construct the corresponding support increases.
Let Gi be the graph (with n(ai) vertices and m(ai) edges) corresponding to a sampled point ai for the interval and suppose that is the graph (with n(Ti) vertices and m(Ti) edges, where Ti is the complete or total infected cases as defined earlier) with all possible reported cases being reported, then, let us say, (Gi converges to ) for all i. As the reported number of cases increases and it approaches towards the number of complete/actual cases, then the graph Gi would ideally expand (in terms of vertices and edges) towards . We refer here to this process as convergence. That is, the structure of two graphs Gi and get closer. In the number of vertices is the number of actual disease cases and the edges are connected between the closest vertices. In reality, is not possible to draw because we would not be able to observe all cases. It is challenging to understand beforehand what fraction of the size of would be the size of Gi, and this guess could give the speed of convergence to from Gi. The actual time steps taken from Gi to for each i is not constant. We assume that there will be a finite number of time steps to reach from Gi to Usually c is not constant as well because the error rates vary so we let c 0 correspond to complete reporting at t 0, c 1 at t 1, and so on up to cn at tn. Let be the graph at t 0 and be the graph at t 1 and so on. The corresponding sizes of graphs will be for and and
(4.1) |
But the inequality
need not hold. The explanation for these inequalities is as follows: graphs within each time interval could converge toward actual disease cases but the size of the graph across various time intervals need not follow any monotonic property because degrees of error in reported cases could vary over time.
For simplicity, let Gi be represented by (Vi, Ei) and be represented by where . As the reporting of diseases cases improves the values of (Vi, Ei) increases so that they become exactly We denote here as . See Fig. 4.5 .
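The growth of a reported-case graph towards the complete-case graph can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the seven case locations are hypothetical, "edges are connected between the closest vertices" is read as linking each vertex to its single nearest neighbour, and the function names `nearest_neighbour_edges` and `graph_sizes` are not from the paper.

```python
import math

def nearest_neighbour_edges(points):
    """Undirected edges linking every vertex to its closest other vertex."""
    edges = set()
    for i, p in enumerate(points):
        others = [k for k in range(len(points)) if k != i]
        if not others:
            continue  # a single vertex has no edges
        j = min(others, key=lambda k: math.dist(p, points[k]))
        edges.add((min(i, j), max(i, j)))
    return edges

def graph_sizes(true_cases, reporting_order):
    """(vertices, edges) of the reported-case graph after each new report."""
    reported, sizes = [], []
    for idx in reporting_order:
        reported.append(true_cases[idx])
        sizes.append((len(reported), len(nearest_neighbour_edges(reported))))
    return sizes

# Hypothetical locations of the 7 actual cases in one time interval.
true_cases = [(0, 0), (1, 0), (0, 2), (5, 5), (6, 5), (5, 7), (9, 1)]
# Hypothetical order in which those cases reach the reporting system.
sizes = graph_sizes(true_cases, reporting_order=[0, 3, 1, 6, 4, 2, 5])
full_size = (len(true_cases), len(nearest_neighbour_edges(true_cases)))
print(sizes)
print("complete graph size:", full_size)
```

In this sketch the vertex count grows by one with each report and the last entry of `sizes` coincides with the size of the complete graph, which is the sense of convergence used in the text. The number of steps needed depends on the reporting order, which is not known in practice.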
We define as the local steady-state values and as global steady-state value. The maximum value of reported cases within the interval i in (4.1) is referred as the local steady-state for the ith interval. Then, there will be no further improvement of reported cases in that interval. Similarly, global steady-state refers to the maximum number of reported cases for an epidemic across the intervals.
Theorem 1
The size of each graph within could reach local steady-state and the global steady-state is equal to one of the local steady-states.
Proof
Suppose the values are unique such that
There will be only one value of and there will be n such maximum values. One of the n maximum values is the maximum reported cases in one of the n time intervals, so that global steady-state is one of the local steady states. Alternatively, let,
but such that
In this situation also, there will be n maximum values and one of the values
is the global maximum. If
then two local steady-states will be equal to the global steady-state. □
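The statement of Theorem 1 can be checked on a small numerical sketch. The counts below are hypothetical, and the helper names `local_steady_states` and `global_steady_state` are introduced here only for illustration. Each row holds the successive reported totals for one interval as reporting improves.

```python
def local_steady_states(reported_over_time):
    """Maximum reported count reached within each interval."""
    return [max(row) for row in reported_over_time]

def global_steady_state(reported_over_time):
    """Maximum reported count across all intervals."""
    return max(local_steady_states(reported_over_time))

# Hypothetical data: rows are intervals, columns are successive updates.
reported_over_time = [
    [40, 52, 55, 55],
    [30, 30, 31, 31],
    [61, 70, 70, 70],
]
local_values = local_steady_states(reported_over_time)
global_value = global_steady_state(reported_over_time)
print(local_values, global_value)
```

Because the global value is a maximum taken over the local values, it necessarily coincides with at least one of them, and with more than one when ties occur.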
For each i, and Gi are associated with reported cases. For any i, when then Gi and are identical, and this situation refers to complete reporting of disease cases. If for any i, then local steady-state for this i attained at t 0.
Remark 2
If for each i and then the global steady-state is also attained at t 0. If for all i, then the global steady-state is attained at a time greater than t 0.
When for each i, then the global steady-state value could provide information on degree of reporting error to some extent. If for some i, and by chance at this i, the global steady-state occurs then that would not provide any information on the degree of reporting errors, because at several other i values we will have and actual total epidemic cases are more than sample epidemic cases.
The statements in Theorem 1 and Remark 2 above will change when multiple reporting exists in one or more of the time intervals considered. Multiple reporting of cases is usually defined as reporting a disease case more than once and treating it as more than one event of disease occurrence. When multiple reporting exists at each i, then is not the global steady-state. A mixed situation, where multiple reporting and under-reporting exist simultaneously within the longer time interval [t 0, tn], is treated separately.
With this method, we will develop a series of wavelets.
If the information needed to construct Fig. 4.5(a) is available, and the rate at which this graph evolves from Fig. 4.5(a) to Fig. 4.5(d) (which we referred to above as the support) is known, then, combined with the information stored in Fig. 4.6, we can construct Fig. 4.3. Within each of the intervals, the information in Figs. 4.5–4.6 will be used to construct a series of wavelets.
In the next few paragraphs, we will describe the philosophy of wavelets and Meyer wavelets. Why Meyer wavelets? Because they are orthogonal, smooth, and rapidly vanishing. Thus they are more stable.
Suppose two functions Ψ(t) and Φ(t) together describe the epidemic wave of a true epidemic, and that pairs of functions {(Ψi(t), Φi(t))} describe the epidemic waves of fractions of this true epidemic. Then one of our central ideas is to determine which of these fractional waves (as described in Section 4) is closest to the true epidemic. Usually data to construct a couple of such fractional wavelets could be observed in an emerging epidemic, say (Ψa(t), Φa(t)) and (Ψb(t), Φb(t)), so the first step is to construct these pairs of wavelets. These fractional wavelets are constructed on partial data (partial in the sense that observed data on disease cases in an emerging epidemic is not complete). The question we are attempting to answer is: can we predict (Ψ(t), Φ(t)) from either one or both of the fractional wavelets? [Note: There is no terminology of “fractional wavelet” in the literature, but we are calling (Ψa(t), Φa(t)) and (Ψb(t), Φb(t)) the fractional wavelets.] For this, let us define Meyer wavelets, which are readily available and could be a good first step to explain our epidemic situation. We define the Meyer wavelet and briefly describe it below:
The Meyer wavelet is an orthogonal wavelet created by Yves Meyer. It is a continuous wavelet, and has been applied to the study of adaptive filters, random fields, and multi-fault classification.
Definition 3
The Meyer wavelet is an infinitely differentiable function that is defined in the frequency domain in terms of a function ν as follows:
Here
There are other possible choices for ν.
The Meyer scaling function is given by
Of course it holds, as usual, that
and
for some periodic m 0(ω/2). Finally,
It turns out that the wavelets
form an orthonormal basis for the square integrable functions on the real line.
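For reference, the standard textbook form of the Meyer construction is sketched below in LaTeX. The formulas are the commonly cited ones and are assumed, not confirmed, to match Definition 3. The piecewise-linear auxiliary function ν is only the simplest admissible choice (any ν with ν(x) + ν(1 − x) = 1 on [0, 1] works, and a smooth ν is needed for an infinitely differentiable wavelet). The normalization 1/√(2π) is one of several conventions in use.

```latex
% Meyer wavelet in the frequency domain (standard form, assumed notation)
\hat{\Psi}(\omega) =
\begin{cases}
\dfrac{1}{\sqrt{2\pi}}\,
  \sin\!\left(\dfrac{\pi}{2}\,\nu\!\left(\dfrac{3|\omega|}{2\pi}-1\right)\right) e^{i\omega/2},
  & \dfrac{2\pi}{3} < |\omega| < \dfrac{4\pi}{3}, \\[2ex]
\dfrac{1}{\sqrt{2\pi}}\,
  \cos\!\left(\dfrac{\pi}{2}\,\nu\!\left(\dfrac{3|\omega|}{4\pi}-1\right)\right) e^{i\omega/2},
  & \dfrac{4\pi}{3} < |\omega| < \dfrac{8\pi}{3}, \\[2ex]
0, & \text{otherwise.}
\end{cases}

% Simplest auxiliary function
\nu(x) =
\begin{cases}
0, & x < 0, \\
x, & 0 \le x \le 1, \\
1, & x > 1.
\end{cases}

% Meyer scaling function in the frequency domain
\hat{\Phi}(\omega) =
\begin{cases}
\dfrac{1}{\sqrt{2\pi}}, & |\omega| < \dfrac{2\pi}{3}, \\[2ex]
\dfrac{1}{\sqrt{2\pi}}\,
  \cos\!\left(\dfrac{\pi}{2}\,\nu\!\left(\dfrac{3|\omega|}{2\pi}-1\right)\right),
  & \dfrac{2\pi}{3} < |\omega| < \dfrac{4\pi}{3}, \\[2ex]
0, & \text{otherwise.}
\end{cases}

% Dilates and translates forming an orthonormal basis of L^2(\mathbb{R})
\Psi_{j,k}(t) = 2^{j/2}\,\Psi\!\left(2^{j}t - k\right), \qquad j, k \in \mathbb{Z}.
```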
One proposition that could be formed is “if a wavelet is constructed on the partial data of a particular series of events in a population, then this wavelet cannot be fully compared with a wavelet constructed from the full data series of all events in the same population.” Building a measure associated with these two wavelets is interesting, and there could be several such measures based on the level of completeness in the data. Because we are dealing with true versus reported disease cases, this measure (a set of points each representing a distance between true and observed cases) could be termed the error in the reporting of disease cases. Such measures, after further filtration, can be useful for practical epidemiologists. Instead of constructing wavelets for the overall epidemic duration, we will construct wavelets within the intervals described in Fig. 4.3. As the reported cases within an interval improve, as described in Fig. 4.5, the wavelet configuration improves. Each of the fractional wavelets obtained from partial data will be updated using the information shown in Fig. 4.7. Meyer wavelets for various equally spaced intervals are demonstrated in Fig. 4.8.
4.1.1. Computation
Suppose a sample point is obtained as explained in the Section 4.1 for an interval . An improvement of reported cases is ascertained from the data obtained in the subsequent time intervals. One way to update this is from future epidemic cases that were infected and/or diagnosed for the period but made available during any of the intervals for That is, the sum of the epidemic cases that were reported during and those reported during each of the future time intervals for and that belong to the interval will be treated as an improved number of reported cases for the interval . We will update the reported number of cases in a previous interval from the future available reported cases that were associated with the previous time interval. Hence the evolution of the data for the interval can be used to construct graphs as shown in Fig. 4.5. Since this evolution is assumed to be observed for a long period, it is also assumed that Gi will be approximately convergent to As an epidemic progresses, we will update the intervals with newly available information during and as i approaches n then those intervals nearby to n will have less of a chance of evolution or less chance of up-gradation (due to a truncation effect). Once the reported numbers are complete, we will similarly study the recovery stage of an epidemic and hence collect the data to compute Fig. 4.6. Accumulation of old cases in new intervals distributed back to the respective time intervals is schematically described in Fig. 4.7. This procedure will update the reported cases in the past as long as at least one reported case observed in the present time interval belongs to one of the past time intervals. Based on the location of the reported cases the graphs constructed will be updated as well. 
Hence for each present time interval that we observe a reported case that belongs to one of the past time intervals, the fractional wavelets will become graphically closer to the complete (or true) wavelet for that time interval.
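The retrospective update described above can be sketched in a few lines. The input is the lower-triangular layout of Table 3, read here as: row i lists the cases reported in the ith interval, split by the interval to which they belong. That reading of the table, and the names `TABLE3` and `adjusted_counts`, are assumptions made for this illustration.

```python
# Row i: cases reported in interval i, split by the interval they belong to
# (columns t0, t1, ...). Values are those printed in Table 3.
TABLE3 = [
    [83],
    [17, 47],
    [23, 1, 63],
    [4, 0, 1, 80],
    [0, 6, 0, 1, 83],
    [2, 4, 1, 0, 0, 53],
    [3, 9, 2, 0, 0, 0, 48],
    [10, 2, 8, 0, 0, 0, 0, 35],
    [5, 0, 1, 0, 0, 0, 0, 1, 33],
    [2, 1, 0, 1, 0, 0, 0, 0, 0, 16],
    [8, 3, 0, 2, 0, 0, 0, 0, 0, 0, 32],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 28],
]

def adjusted_counts(rows):
    """Move late-reported cases back to the interval in which they occurred."""
    adjusted = [0] * len(rows)
    for row in rows:
        for j, count in enumerate(row):
            adjusted[j] += count  # column j collects all cases belonging to interval j
    return adjusted

reported = [sum(row) for row in TABLE3]   # unadjusted totals per reporting interval
adjusted = adjusted_counts(TABLE3)        # totals after redistribution
print("reported:", reported)
print("adjusted:", adjusted)
```

The redistribution leaves the overall number of cases unchanged and only moves cases between intervals. The most recent intervals have had little opportunity to receive late reports, which is the truncation effect noted in the text.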
5. Numerical examples
To demonstrate our method, we have constructed two numerical toy examples. We have generated non-uniform random positive integers of ai values for The ai values generated are in Table 1 .
Table 1.
A fraction of ai for i > 1 was assumed to have occurred prior to the ith time interval but to have been reported in the ith time interval, and we have adjusted the excess values of ai as per the formula below:
(5.1) |
where is the number of reported cases in the ith time interval which belonged to the ith time interval and for j < i is the reported cases in the ith time interval which belonged to one or more of the jth time interval for Using the below formula (5.1), we will retrospectively adjust a 2, values by using below equations:
(5.2) |
The values of reported cases in the expression in (5.2) in the ith time interval are numerically computed using the set of equations in (5.3) (Table 2 )
(5.3) |
Example 4
In (5.3), and ϕ 22 is a random fraction in [0.6,0.95], i.e. among the reported cases in the 2nd time interval 60 % to 95 % belong to the same time interval and fraction belong to the 1st time interval. The interval [0.6,0.95] is arbitrary and it can be changed to a different range of fractions. Again, assume and ϕ 33 is picked randomly to take a value within the interval [0.6,0.95]. Once ϕ 33 is fixed, ϕ 31 is assumed to take a value that is picked randomly from the interval . The number ϕ 32 is obtained as
Table 2.
In a similar way, is assumed to be 1 and ϕii is picked randomly to take a value within the interval [0.6,0.95]. Once ϕii is fixed, ϕ i1 is assumed to take a value that is picked randomly from the interval ϕ i2 is assumed to take a value that is picked randomly from the interval and so on is assumed to take a value that is picked randomly from the interval
Finally, is obtained as All the random values picked are expanded to two decimal places. In case due to rounding of numbers to nearest integer causes a situation such that then the excess value is discounted from the value already computed for aiϕii to make If the already-computed value for then the discount is done from aiϕ i1, and if then discounting will be done from aiϕ i2, and it will be repeated until the discounting of excess value is done (Fig. 5.1 ).
Using the ϕ values and the expressions in (5.3), we partition the reported values in each interval into two groups: cases that belong to the same interval and cases that belong to previous intervals. See Table 3.
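The sampling of the ϕ values and the integer partition of ai can be sketched as follows. Several details are assumptions made for this illustration: each earlier fraction is drawn uniformly from whatever share remains after the fractions already fixed, the last earlier fraction takes the remainder, any rounding excess is discounted first from the own-interval count and then from intervals 1, 2, and so on, and any rounding shortfall is added to the own-interval count. The function names `sample_phi` and `partition_cases` are hypothetical.

```python
import random

def sample_phi(i, lo=0.6, hi=0.95, rng=random):
    """Return [phi_i1, ..., phi_ii]: two-decimal fractions that sum to 1."""
    if i == 1:
        return [1.0]
    phi_ii = round(rng.uniform(lo, hi), 2)      # share belonging to interval i itself
    remaining = round(1.0 - phi_ii, 2)
    earlier = []
    for _ in range(i - 2):                      # phi_i1 ... phi_i,i-2
        p = round(rng.uniform(0.0, remaining), 2)
        earlier.append(p)
        remaining = round(remaining - p, 2)
    earlier.append(remaining)                   # phi_i,i-1 takes what is left
    return earlier + [phi_ii]

def partition_cases(a_i, phi):
    """Split a_i into integer counts per interval so that the counts sum to a_i."""
    counts = [round(a_i * p) for p in phi]
    excess = sum(counts) - a_i
    # Assumed discount order: own interval first, then intervals 1, 2, ...
    order = [len(counts) - 1] + list(range(len(counts) - 1))
    for k in order:
        if excess <= 0:
            break
        take = min(counts[k], excess)
        counts[k] -= take
        excess -= take
    if excess < 0:                              # assumed handling of a shortfall
        counts[-1] += -excess
    return counts

random.seed(0)
phi_rows = [sample_phi(i) for i in range(1, 13)]
count_rows = [partition_cases(60, phi) for phi in phi_rows]   # 60 is a hypothetical a_i
for phi, counts in zip(phi_rows, count_rows):
    print(phi, counts)
```

Passing `lo=0.5, hi=0.8` to `sample_phi` gives the setting of the second example.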
Example 5
We have constructed our second toy example by considering a different interval, [0.5,0.8], from which to pick the ϕii values. We have repeated the procedure explained in Example 4 to obtain the ϕ value table, the partition values of ai, and the adjusted values of ai. The results are shown in Table 4.
Meyer wavelets of various orders were computed based on the number of cases reported as a proportion of the total adjusted cases in the second example (Example 5, Table 4). These are shown in Fig. 5.2. Meyer wavelets are smooth and rapidly decreasing. The basis made up of Meyer wavelets is unconditional. The wavelets act robustly in our situation, and hence the pictorial differences between the three wavelets of Fig. 5.2 are minimal. Yet differences in pattern are present because of the differing orders of the wavelets. The ai values are taken from each of the numerical examples or from real epidemic data. They can be combined with the network of people who are connected with each of the disease cases to construct graphs for each interval i. We propose that such graphs will help in devising measures to control the spread.
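A minimal numerical sketch of the standard Meyer wavelet in the frequency domain is given below. It does not reproduce Fig. 5.2, since the way the case proportions enter the computation is not specified here. The piecewise-linear auxiliary function and the function names `nu` and `meyer_psi_hat` are assumptions made for illustration.

```python
import cmath
import math

def nu(x):
    """Simplest auxiliary function: 0 below 0, x on [0, 1], 1 above 1."""
    return min(max(x, 0.0), 1.0)

def meyer_psi_hat(omega):
    """Fourier transform of the Meyer wavelet at angular frequency omega."""
    w = abs(omega)
    if 2 * math.pi / 3 < w <= 4 * math.pi / 3:
        amp = math.sin(math.pi / 2 * nu(3 * w / (2 * math.pi) - 1))
    elif 4 * math.pi / 3 < w <= 8 * math.pi / 3:
        amp = math.cos(math.pi / 2 * nu(3 * w / (4 * math.pi) - 1))
    else:
        amp = 0.0  # compactly supported in frequency
    return amp * cmath.exp(1j * omega / 2) / math.sqrt(2 * math.pi)

def dyadic_energy(omega, levels=range(-12, 13)):
    """2*pi times the sum over dyadic scales of |psi_hat(2^j * omega)|^2."""
    return 2 * math.pi * sum(abs(meyer_psi_hat(2.0 ** j * omega)) ** 2 for j in levels)

for omega in (0.3, 1.0, 2.5, 7.0, -4.2):
    print(omega, dyadic_energy(omega))
```

The printed values should be numerically equal to one for every nonzero frequency. This dyadic partition of unity is a necessary condition for the dilates and translates of the wavelet to form an orthonormal basis, and it serves as a quick sanity check on the implementation.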
Table 3.
In the column headers, ti denotes the time interval beginning at ti; each entry is the number of cases that belonged to that interval.
Reported cases in ith interval | t0 | t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8 | t9 | t10 | t11 |
i = 1 | 83 | | | | | | | | | | | |
i = 2 | 17 | 47 | | | | | | | | | | |
i = 3 | 23 | 1 | 63 | | | | | | | | | |
i = 4 | 4 | 0 | 1 | 80 | | | | | | | | |
i = 5 | 0 | 6 | 0 | 1 | 83 | | | | | | | |
i = 6 | 2 | 4 | 1 | 0 | 0 | 53 | | | | | | |
i = 7 | 3 | 9 | 2 | 0 | 0 | 0 | 48 | | | | | |
i = 8 | 10 | 2 | 8 | 0 | 0 | 0 | 0 | 35 | | | | |
i = 9 | 5 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 33 | | | |
i = 10 | 2 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 16 | | |
i = 11 | 8 | 3 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 32 | |
i = 12 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 28 |
Table 4.
Example 1 data | ||||||||
Example 2 data | ||||||||
6. Discussion
We provide a group of epidemic growth scenarios inspired by the harmonic analysis set-up. A couple of scenarios from this could represent true epidemic growth curves. With this, we will be in a position to assess the level of under-reporting in a particular epidemic. So the strategy we are proposing could be beneficial in not only building a true epidemic but also assessing the level of reporting error in an epidemic. We provide the true epidemic with noise to illustrate our claim (Table 5 ).
Table 5.
In addition to the gain in construction of a true epidemic, we also propose new methods that blend harmonic analysis with dynamical systems and sampling strategy. In this way, harmonic analysis is used to bridge the gap between unknown and known information in disease epidemiology. Suppose we determine that one of the fractional wavelets, say (Ψa(t), Φa(t)), is closest to the true wavelet (conceptually based on actual disease cases in the population); then finding a measure of the difference between the plots of the two wavelets (Ψa(t), Φa(t)) and (Ψ(t), Φ(t)) will complete the mapping of the epidemic at time t. However, determining which one of the fractions (Ψi(t), Φi(t)) is closest to the true one is not so easy. But if there is no significant multiple reporting of disease cases, then the largest fractional wavelet could be assumed to be the one with the shortest measure (Table 6).
Table 6.
In the column headers, ti denotes the time interval beginning at ti; each entry is the number of cases that belonged to that interval.
Reported cases in ith interval | t0 | t1 | t2 | t3 | t4 | t5 | t6 | t7 | t8 | t9 | t10 | t11 |
i = 1 | 83 | | | | | | | | | | | |
i = 2 | 19 | 45 | | | | | | | | | | |
i = 3 | 22 | 17 | 50 | | | | | | | | | |
i = 4 | 24 | 6 | 2 | 53 | | | | | | | | |
i = 5 | 4 | 1 | 10 | 16 | 59 | | | | | | | |
i = 6 | 17 | 2 | 1 | 0 | 0 | 40 | | | | | | |
i = 7 | 29 | 1 | 0 | 1 | 1 | 1 | 31 | | | | | |
i = 8 | 0 | 6 | 4 | 1 | 0 | 0 | 1 | 43 | | | | |
i = 9 | 12 | 2 | 1 | 0 | 0 | 0 | 0 | 1 | 24 | | | |
i = 10 | 3 | 2 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 12 | | |
i = 11 | 14 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 28 | |
i = 12 | 2 | 3 | 6 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 18 |
We have argued that various combinations of partial data can be used for discrete constructions, which in turn provide supporting information for constructing wavelets. These two aspects are what make the proposed work innovative. In summary, given partial data on an event (here, the reporting of disease cases), we aim to construct the complete event data. Wavelets play a key role in processing the accumulated data to build the complete event data. The event here is the number of cases reported in a time interval, and these reported cases represent only part of the actual epidemic cases. In this paper we have demonstrated a method for bringing partial data closer to what the complete data could be, by updating the data reported in each interval toward the actual number of disease cases.
This exploration will assist in visualizing the spread of any emerging epidemic more realistically. We also believe that our methods could provide additional tools for epidemic modelers who typically begin with modeling tools such as ODEs and PDEs. We are seeking not only academic development through this project but also a clear, non-trivial body of techniques applying harmonic analysis to epidemic analysis in a way that has not been seen before. We plan to develop further insights into how to construct wavelets for medical applications. The bottom line is that we will be able to help public health planners with better management and courses of action during emerging epidemics. The kind of analysis we present here to fill in missing pieces of epidemic reporting information can be applied to other areas, for example, constructing the total rhythm of a heartbeat from partial information.
We have identified a gap in the methods for understanding true epidemic growth and spread, and we have tried to address this lacuna by proposing a novel method. We propose that wavelets could offer a road map toward a practical solution (we are aware that a perfect solution is impossible by any method, because some disease cases are never reported in any situation). The technical aspects of the approach depend on the construction of discrete graphs and fractional wavelets. Fractional wavelets are newly introduced to the literature through this study. A solution through wavelets is also not trivial, because there is no ready-made set of wavelets that offers a timely road map; we have therefore introduced a novel strategy. Hence we argue that our approach will bring us closer to our aim of understanding epidemics in a more accurate and timely fashion. As a by-product, we can develop techniques for data scientists to analyze disease surveillance.
7. Questions still remain
Within what period after its emergence can we generate a true epidemic using the harmonic analysis set-up? Can we predict the full picture of an epidemic from only partial data? Can we measure the validity and the accuracy of an epidemic growth curve? Can we conduct a timely analysis of the preliminary data using the proposed techniques?
Funding
None.
CRediT authorship contribution statement
Steven G. Krantz: Conceptualization, Methodology, Validation, Investigation. Peter Polyakov: Writing - review & editing. Arni S.R. Srinivasa Rao: Conceptualization, Data curation, Formal analysis, Methodology, Validation, Investigation, Project administration, Software, Supervision, Visualization, Writing - original draft.
Declaration of Competing Interest
None.
Acknowledgements
We thank both the referees and the handling editor for valuable comments which helped us to revise our manuscript thoroughly.
Contributor Information
Steven G. Krantz, Email: sk@math.wustl.edu.
Peter Polyakov, Email: polyakov@uwyo.edu.
Arni S.R. Srinivasa Rao, Email: arrao@augusta.edu.
References
- Anderson R.M., Grenfell B.T., May R.M. Oscillatory fluctuations in the incidence of infectious disease and the impact of vaccination: time series analysis. J. Hyg. 1984;93:587–608. doi: 10.1017/s0022172400065177.
- Anderson R.M., May R.M. Infectious Diseases of Humans: Dynamics and Control. Oxford Univ. Press; Oxford: 1991.
- Bartlett M.S. Deterministic and stochastic models for recurrent epidemics. Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 4: Contributions to Biology and Problems of Health, 81–109. University of California Press; Berkeley, Calif: 1956. https://projecteuclid.org/euclid.bsmsp/1200502549.
- Bartlett M.S. Measles periodicity and community size. J. R. Stat. Soc. A. 1957;120:48–70.
- Bauch C.T. The role of mathematical models in explaining recurrent outbreaks of infectious childhood diseases. In: Mathematical Epidemiology. Vol. 1945. Springer Verlag; 2008. (Lecture Notes in Mathematics, Mathematical Biosciences Subseries).
- Bauch C.T., Earn D.J.D. Transients and attractors in epidemics. Proceedings of the Royal Society of London B. Vol. 270. 2003. pp. 1573–1578.
- Bauch C.T., Earn D.J.D. Interepidemic intervals in forced and unforced SEIR models. In: Ruan S., Wolkowicz G., Wu J., editors. Dynamical Systems and Their Applications in Biology. Vol. 36. Fields Institute Communications; 2003. pp. 33–44.
- Cazelles B., Cazelles K., Chavez M. Wavelet analysis in ecology and epidemiology: impact of statistical tests. J. R. Soc. Interface. 2014;11:20130585. doi: 10.1098/rsif.2013.0585.
- Earn D., Rohani P., Grenfell B.T. Spatial dynamics and persistence in ecology and epidemiology. Proc. R. Soc. Lond. B. 1998;265:7–10. doi: 10.1098/rspb.1998.0256.
- Grenfell B.T., Bjørnstad O.N., Kappey J. Travelling waves and spatial hierarchies in measles epidemics. Nature. 2001;414:716–723. doi: 10.1038/414716a.
- Grenfell B.T., Bolker B.M., Kleczkowski A. Seasonality and extinction in chaotic metapopulations. Proc. R. Soc. Lond. B. 1995;259:97–103.
- Hernández E., Weiss G. A First Course on Wavelets. CRC Press LLC; 1996.
- Krantz S.G. A Panorama of Harmonic Analysis. Vol. 27. Mathematical Association of America; Washington: 1999. (Carus Mathematical Monographs).
- Krantz S.G. Fourier analysis and differential equations with wavelets and applications. 2019. Book in preparation.
- Labate D., Weiss G., Wilson E. Wavelets. Notices Amer. Math. Soc. 2013;60:66–76.
- Meyer Y. Wavelets, Vibrations and Scalings. With a preface in French by the author. Vol. 9. American Mathematical Society; Providence, RI: 1998. (CRM Monograph Series).
- Meyer Y., Ryan R.D. Wavelets: Algorithms and Applications. Society for Industrial and Applied Mathematics; Philadelphia: 1993.
- Rao A.S.R.S. Understanding theoretically the impact of reporting of disease cases in epidemiology. J. Theoret. Biol. 2012;302:89–95. doi: 10.1016/j.jtbi.2012.02.026.
- Soper H.E. The interpretation of periodicity in disease prevalence. J. Roy. Stat. Soc. Ser. A. 1929;92:34–61.
- Strichartz R.S. How to make wavelets. Am. Math. Mon. 1993;100(6):539–556.
- Walker J.S. Fourier analysis and wavelet analysis. Notices Amer. Math. Soc. 1997;44(6):658–670.