Abstract
Simple Summary
This article studies the time series given by the daily number of reported cases of COVID-19. During the COVID-19 pandemic, most people regarded the oscillations around the exponential growth at the beginning of an epidemic wave as a defect in the reporting of the data. The residual is probably partly due to the data reporting process (random noise). Nevertheless, a significant part of these oscillations could be connected to the infection dynamics at the level of a single average patient. Ultimately, the central question we try to address here is: Is there some hidden information in the signal around the exponential tendency for COVID-19 data?
Abstract
Background: The age of infection plays an important role in assessing an individual’s daily level of contagiousness, quantified by the daily reproduction number. We derive an autoregressive moving average model from a daily discrete-time epidemic model based on a difference equation involving the age of infection. Novelty: The article’s main idea is to use a part of the spectrum associated with this difference equation to describe the data and the model. Results: We present some results on the identification of the model’s parameters when all the eigenvalues are known. This method, applied to Japan’s third epidemic wave of COVID-19, fails to preserve the positivity of the daily reproduction numbers. This problem forced us to develop an original truncated spectral method, which we apply to the Japanese data. We start by considering ten days and extend our analysis to one month. Conclusion: We can identify the shape of the daily reproduction numbers curve throughout the contagion period using only a few eigenvalues to fit the data.
Keywords: epidemic models, time series, spectral method, spectral truncation method, phenomenological models
1. Introduction
Modeling an epidemic peak requires precise knowledge of the daily data corresponding to new cases. One of the aims of the paper is to extract the value of the average daily reproduction numbers. The daily reproduction numbers vary from individual to individual and from day to day during the period of contagiousness of an individual. These numbers depend on the age of infection, i.e., the number of days since the individual contracted the infectious disease.
From a discrete model of the evolution of new daily cases, we propose to evaluate the average number of secondary infected individuals produced by a single infected individual on each day d since infection. For this purpose, in addition to the dominant eigenvalue, we estimate from the data other significant subdominant (complex) eigenvalues, which explain the modulation of the growth and improve the fit of the model to the data.
For that purpose, we reconsider the discrete-time epidemic model with the age of infection presented in Demongeot et al. [1]. This model is a discrete-time version of the Volterra integral formulation of the Kermack–McKendrick model with age of infection [2]. The variation of the number of susceptible individuals is given each day by
| (1) |
where is the number of susceptible individuals at time t, and is the daily number of new infected at time t. Throughout the paper, we use the following convention for the sum
As a consequence, when , (1) gives
We assume for simplicity that the epidemic starts from a single cohort of infected at time , then the number of infectious individuals is given by
| (2) |
where is the number of infected individuals at time , and is the probability for an infected to be infectious after d day of infection. In particular, we have .
We assume that the number of new infected at time t is the product of the transmission rate with the number of susceptible individuals and the number of infectious at time t. That is,
| (3) |
By replacing by the right hand side of (2) in (3), we obtain
| (4) |
Now assuming that and are constant (over a short period of time), then we define the daily reproduction numbers as
The quantity is the average number of secondary infected produced by a single infected on the day d since infection (see [1] for more details). Therefore, the basic reproduction number is the following quantity
| (5) |
where n is the maximal duration (in days) of the infection.
Moreover, when and are constant, Equation (4) becomes a linear discrete time Volterra integral equation
| (6) |
where (I) is the number of infected produced directly by the infected individuals already present on day , and (II) is the number of new infected individuals at time t produced by the new infected individuals since day .
If we consider the first terms of the discrete time Volterra Equation (6), we obtain
In practice, we can assume that since infected individuals are not infectious immediately after being infected. Under this additional assumption, we obtain the system
Therefore, (6) can be rewritten as a scalar delay difference equation
| (7) |
Assume that the infectious period is n days. That is
Then by defining , Equation (6) becomes
| (8) |
with the initial values
| (9) |
The goal of this article is to understand how to identify the daily reproduction numbers in (8) knowing on some finite time interval. This problem is particularly important to derive the average dynamic of infection at the level of a single patient.
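The renewal recursion behind (8) can be illustrated with a minimal sketch. It assumes that (8) has the usual form N(t) = R0(1) N(t-1) + ... + R0(n) N(t-n); the names `simulate_renewal`, `r0_example`, and the numerical values are illustrative assumptions, not estimates from this article.

```python
def simulate_renewal(r0, initial, steps):
    """Iterate N(t) = sum_{d=1..n} r0[d-1] * N(t-d).

    r0      -- assumed daily reproduction numbers R0(1), ..., R0(n)
    initial -- the n most recent daily case counts, oldest first
    steps   -- number of new days to generate
    """
    n = len(r0)
    if len(initial) != n:
        raise ValueError("need exactly n initial values")
    cases = list(initial)
    for _ in range(steps):
        # cases[-d] is N(t-d): the cohort infected d days before the new day
        new_cases = sum(r0[d - 1] * cases[-d] for d in range(1, n + 1))
        cases.append(new_cases)
    return cases


# Hypothetical U-shaped profile over n = 5 days (illustrative values only).
r0_example = [0.0, 0.4, 0.1, 0.1, 0.5]
# The basic reproduction number is the sum of the daily reproduction numbers.
basic_r0 = sum(r0_example)
trajectory = simulate_renewal(r0_example, [1.0] * 5, steps=30)
```

In this sketch the identification problem reads in reverse: given `trajectory` on a finite window, one would like to recover `r0_example`.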
One of the aims of this paper is to investigate the variations of the daily reproduction number during the period of contagiousness of infectious individuals.
Contagiousness is not constant in influenza, as shown in simulated data [3] and in real infected animals, where we observe a U-shaped evolution of the viral load and of symptoms such as body temperature during the contagiousness period. From there, it is possible to suspect a U-shaped variation in their ability to emit the virus (aerosol transmission) and, therefore, to contaminate others [4].
After the first asymptomatic period (without contagiousness), the daily reproduction number increases. After one to three days, this number decreases due to the action of the first defense, the innate immune system. However, the virus overcomes this first immune defense, and the daily reproduction number increases again before the second defense, the adaptive immune system, takes effect. Then, after two to four days, the adaptive immune response becomes fully effective. The combination of these biological mechanisms causes the U- or M-shaped curve of the daily reproduction numbers.
The literature about parameters identification for epidemic models with age of infection can be divided into two groups of articles depending on the assumptions made. The first group assumes that is a given function and estimates the time dependent transmission rate . As a consequence, they obtain the instantaneous (daily or effective) reproduction number, which is
We refer to [5,6,7,8,9,10,11,12] (and references therein) for more results about this subject.
The second group corresponds to the assumptions considered here. That is, we assume that and are constant functions (over a short period of time) and estimate the daily reproduction number. That is the case for the discrete time model in [13] and more recently for the continuous time model in [1]. The major drawback in [13] is that the estimated daily reproduction number does not remain positive. We encounter the same problem in Section 3.1 when we use the full spectrum. In Section 3.2, to solve this problem, we introduce a method using only the dominant and secondary eigenvalues.
This article aims to investigate the shape of the distribution from the data of COVID-19. In Figure 1, we illustrate the notion of U- or M-shaped distribution.
Figure 1.
In this figure, we illustrate the notion of U-shaped distribution in (a) and M-shaped distribution in (b). Recall that represents the ability of patients to transmit the pathogen d days after they were infected. A U-shaped or M-shaped distribution means that patients can transmit the pathogen from the beginning of their infection, become less infectious in the middle of the infected period, and finally become infectious again at the end of the infected period. The only difference between the U-shaped and M-shaped distributions is the inclusion of days 0 and 8 in the plot.
U- and M-shaped distributions are well known in the context of influenza [3,4]. In Figure 2, we present some figures reflecting patients’ viral load for COVID-19.
Figure 2.
Viral load in COVID-19 real patients [14]. In (a), the red curve corresponds to the throat swab and the blue curve corresponds to the sputum. In (b), the curves correspond to several patients (A), (B), and (C).
Such a U shape has not yet been systematically studied in COVID-19 data, but the evolution of the viral load has been observed in some patients and shows this U shape. Figure 2 shows such a U-shaped evolution for the viral load in real patients [14].
The present work is directly connected to the original work of Peter Whittle in 1951 [15,16], who introduced the Auto Regressive Moving Average (ARMA) model, after the seminal paper on time series by N. Wiener [17],
| (10) |
where is the size at time t of the population whose growth is forecasted, the kernel has real values, n is the regression order, and here stands for a noise. Equation (10) has been extensively studied under the denomination of ARMA models by many authors [18,19,20,21,22,23,24].
Here, we propose a new approach based on the spectral properties of the population growth equation to capture information from data. Our goal is to estimate the shape of the daily reproduction numbers . Spectral methods are not new (see Priestley [20,25]), but the term usually refers to the Fourier transform with frequencies associated with various periods, corresponding to a fundamental period and its sub-multiples (harmonics). If we consider the auto regressive part only, the spectrum of the delay difference equation is determined by its characteristic equation
The main idea in this article is to use these eigenvalues (i.e., the solutions of the characteristic equation) to identify the daily reproduction numbers. The eigenvalues are estimated by a separate method. In Section 2, we will see that when all the eigenvalues are nonzero and pairwise distinct, we can compute the parameters by using the eigenvalues only.
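To make the idea of recovering the parameters from the eigenvalues concrete, here is a minimal sketch. It assumes the characteristic equation has the form x^n = R0(1) x^(n-1) + ... + R0(n), and it simply expands the product of the factors (x - lambda_i); this is a swapped-in elementary technique, not necessarily the formula derived in the appendices. The function name and the test eigenvalues are hypothetical.

```python
def coefficients_from_eigenvalues(eigenvalues):
    """Return [R0(1), ..., R0(n)] such that
    prod_i (x - lambda_i) = x**n - sum_d R0(d) * x**(n - d)."""
    poly = [1.0 + 0j]  # monic polynomial coefficients, highest degree first
    for lam in eigenvalues:
        # multiply the current polynomial by (x - lam)
        shifted = poly + [0j]
        scaled = [0j] + [lam * c for c in poly]
        poly = [a - b for a, b in zip(shifted, scaled)]
    # with complex conjugate pairs, the coefficients are real up to rounding
    return [-c.real for c in poly[1:]]


# A spectrum giving non-negative parameters ...
positive_case = coefficients_from_eigenvalues(
    [1.0, complex(-0.5, 0.5), complex(-0.5, -0.5)]
)
# ... and one giving a negative parameter, which has no epidemiological meaning.
negative_case = coefficients_from_eigenvalues(
    [1.0, complex(0.5, 0.5), complex(0.5, -0.5)]
)
```

Nothing in this construction forces the recovered values to be non-negative, which is the kind of difficulty reported later when the full spectrum is used.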
The idea of using eigenvalues in population dynamics goes back to Malthus [26], who, in 1798, first identified in a mixture of populations the one that would impose itself on the others, determined through the exponential growth of the largest exponent—this leading exponent having been called Malthusian parameter by Fisher [27]. The Malthusian growth seeming unrealistic, the saturation logistic term was introduced further by Lambert [28], and then extending the initial work by Euler [29], Lotka [30], Leslie [31], and Hahn [32] gave the current matrix form of the discrete population growth equations.
However, as far as we know, estimating the subdominant eigenvalues to characterize the system is new. So the key idea of this work is to use the dominant eigenvalue and also the following pair of complex conjugated eigenvalues as an estimator to reconstruct the kernel of the auto regressive part.
This work is motivated by the time series given by the daily numbers of reported cases of COVID-19. During the COVID-19 pandemic, most people regarded the oscillations around the exponential growth at the beginning of an epidemic wave as a defect in the reporting of the data. The residual is probably partly due to the data reporting process (random noise). Nevertheless, a significant part of these oscillations could be connected to the infection dynamics at the level of a single average patient. Ultimately, the central question we try to address here is: Is there some hidden information in the signal around the exponential tendency for COVID-19 data? So we consider the early stage of an epidemic phase, and we try to exploit the oscillations around the tendency in order to reconstruct the infection dynamics at the level of a single average patient.
We start by investigating the connection between a signal decomposed into a sum of damped or amplified oscillations and a renewal equation. The prototype example we have in mind is the following:
where , , , and .
In Figure 3, we illustrate a growing function with damped oscillations (i.e., ) and amplified oscillations (i.e., ). It is clear from Figure 3 that a periodic function cannot represent such a signal, and extending such a signal by periodicity would be artificial. Indeed, the Fourier decomposition would only provide purely imaginary eigenvalues, which would exclude a continuation of the exponential growth (this requires eigenvalues with non-zero real parts). To apply wavelet theory (see, for example, [33]), we need to extend the data for negative times by symmetry with respect to the initial time , and we need a decreasing function ( and ).
Figure 3.
We plot an exponentially growing function with (a) damped oscillations and (b) amplified oscillations.
Here, we are more interested in the model resulting from the data (i.e., , ) than in the fit to the data. The major problem with the Fourier method is that this method provides only eigenvalues with zero real parts (that is due to the periodicity required for this method). Such eigenvalues are well adapted to a periodic signal, but this is not suitable to describe, for example, an ever-growing function (as in Figure 3). Consequently, the Fourier method is not well adapted to derive non-negative daily reproduction numbers (i.e., ).
Previous analogous approaches can be found in the seismic data modeling and statistical literature, like the Wiener–Levinson predictive deconvolution (Robinson [34], Peacock and Treitel [35], Robinson and Treitel [36]), which intends to estimate the minimum phase wavelet in the data, in particular in the case where the relatively weak sampling does not make it possible to affirm the Gaussian character of the errors (Walden and Hosken [37]). If the Gaussian character of the errors can be proven, another similar approach is that of the Geometric Brownian Motion (GBM) processes (Vinod et al. [38]) used, for example, in the analysis of financial data (Ritschel et al. [39]), which are based on the model of the solution of a stochastic differential equation, multiplied by a periodic component with a Gaussian noise.
The structure of this paper is as follows: Section 2 is devoted to the materials and methods. We recall some notions of matrices and spectra. We also present some phenomenological models that will be compared to the data. Section 3 contains the results. We fit the phenomenological models to the cumulative numbers of reported cases in Japan over 10 days and 30 days. We use the eigenvalues derived from the phenomenological model, and we identify the daily reproduction numbers by using: (1) all the spectrum (see Appendix B) and (2) part of the spectrum. The last section of the paper is devoted to the discussion and the conclusion. We present in the Appendices all the mathematical aspects of the paper (see Appendix A, Appendix B, Appendix C and Appendix D).
2. Materials and Methods
2.1. Identification of the Model
The Leslie matrix associated to the difference Equation (8) is
| (11) |
The characteristic equation of (11) is
| (12) |
for , which is equivalent to (whenever )
| (13) |
The complex numbers satisfying the characteristic equation are called the eigenvalues of L.
In Appendix A and Appendix B, we discuss the identification problem of the daily reproduction numbers by using the eigenvalues of L. The main identification result of Appendix B corresponds to the formula (A3).
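The link between the Leslie matrix and the characteristic equation can be checked numerically with a minimal sketch. It assumes that L is the companion-type matrix with the daily reproduction numbers on its first row and ones on the subdiagonal, and that (13) reads R0(1)/x + ... + R0(n)/x^n = 1. All function names and numerical values are hypothetical.

```python
def leslie_matrix(r0):
    """Assumed Leslie matrix: r0 on the first row, ones on the subdiagonal."""
    n = len(r0)
    rows = [list(r0)]
    for i in range(1, n):
        rows.append([1.0 if j == i - 1 else 0.0 for j in range(n)])
    return rows


def dominant_eigenvalue(r0, iterations=200):
    """Positive root of sum_d r0[d-1] / lam**d = 1, found by bisection.

    Requires r0 >= 0 with a positive sum, so the left side decreases in lam.
    """
    if min(r0) < 0 or sum(r0) <= 0:
        raise ValueError("r0 must be non-negative with a positive sum")

    def excess(lam):
        return sum(r / lam ** d for d, r in enumerate(r0, start=1)) - 1.0

    lo, hi = 1e-9, 1.0
    while excess(hi) > 0:  # enlarge the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def eigen_residual(r0, lam):
    """Largest entry of |L v - lam v| for v = (lam**(n-1), ..., lam, 1)."""
    n = len(r0)
    matrix = leslie_matrix(r0)
    v = [lam ** (n - 1 - i) for i in range(n)]
    lv = [sum(a * b for a, b in zip(row, v)) for row in matrix]
    return max(abs(x - lam * y) for x, y in zip(lv, v))


r0_example = [0.0, 0.4, 0.1, 0.1, 0.5]  # illustrative values only
lam1 = dominant_eigenvalue(r0_example)
```

For a Markovian Leslie matrix in the sense of Definition 1 (non-negative entries summing to one), the same routine returns a dominant eigenvalue equal to 1 up to rounding.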
Definition 1.
We will say that L is a Markovian Leslie matrix if all the values are non negative, and
2.2. Phenomenological Model to Fit the Cumulative and the Daily Numbers of Reported Case Data
Due to Lemma A1 below, we propose the following phenomenological model to represent the data
| (14) |
where are non null, are pairwise distinct, and .
Remark 1.
In the above formula, we allow the constant terms whenever.
Assuming that the unit of time is one day, we have the following relationship between the cumulative number of cases and the daily number of cases
We deduce that the daily number of reported cases has the following form
where are non null, and are the same as in (14), and .
Since is obtained from by computing the first derivative, we have the following relationship
Remark 2.
For the daily number of cases data, only a few eigenvalues will be tractable. For example, in Section 3.3, we will consider the following extension
wherewill containmerged together with some random term.
Remark 3.
The identification of the eigenvalues as parameters of the phenomenological model is discussed in Section 3.3. So far, this problem for a finite time interval seems to be open.
We will first approach the data with the following phenomenological model.
Phenomenological model for the cumulative numbers of reported cases with
We start with a first eigenvalue , for some . The phenomenological model used to fit the cumulative numbers of reported cases has the following form
For discrete times, it is equivalent to say that
By computing the first derivative of , we obtain a model for the daily number of cases of the following form
Once the best fit of the above phenomenological model to the data is obtained, we subtract this model from the data , and we obtain a first residual
Next we will approach the residual with the following phenomenological model.
Phenomenological model for the cumulative numbers of reported cases with
Assume that the eigenvalues are two conjugated complex numbers , for some and . The phenomenological model used to fit the cumulative numbers of reported cases has the following form
For discrete times, it is equivalent to say that
By computing the first derivative of , we obtain a model for the daily number of cases of the following form
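The two-stage procedure described above (fit an exponential, then study what is left) can be sketched as follows. This is a simplified stand-in: it fits a pure exponential A exp(alpha t) by linear least squares on the logarithm of the data, which is not necessarily the fitting procedure or the exact model form used in this article. The synthetic data, `fit_exponential`, and `residual` are hypothetical.

```python
import math


def fit_exponential(times, values):
    """Least-squares fit of log(values) = log(A) + alpha * t; returns (A, alpha)."""
    n = len(times)
    logs = [math.log(v) for v in values]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    alpha = sxy / sxx
    return math.exp(y_mean - alpha * t_mean), alpha


def residual(times, values, amplitude, alpha):
    """Data minus the fitted exponential: the signal left for the next fit."""
    return [v - amplitude * math.exp(alpha * t) for t, v in zip(times, values)]


# Synthetic cumulative-like data: exponential growth plus a weekly oscillation.
days = list(range(11))
synthetic = [
    100.0 * math.exp(0.05 * t) + 2.0 * math.cos(2.0 * math.pi * t / 7.0)
    for t in days
]
amplitude_hat, alpha_hat = fit_exponential(days, synthetic)
first_residual = residual(days, synthetic, amplitude_hat, alpha_hat)
```

The list `first_residual` plays the role of the first residual: it carries the oscillating part that a second model with a pair of complex conjugate eigenvalues would then be fitted to.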
Remark 4.
2.3. Cumulative and Daily Number of Reported Cases for COVID-19 in Japan
Here we use cumulative numbers of reported cases for COVID-19 in Japan taken from the WHO [40]. The data show a succession of epidemic waves followed by endemic periods. In Figure 4, black dots represent the data, the blue background regions correspond to epidemic phases, and the yellow background regions to endemic phases. The region of interest to apply the method is between 19 October and 29 October 2020. This region is marked with light green vertical lines in the figure.
Figure 4.
In this figure, we plot the daily number of reported cases for COVID-19 in Japan.
3. Results
3.1. Methods Applied to Ten Days Data
In this section, we will fit the phenomenological model (15) or (18) to the cumulative numbers of reported cases presented in the previous subsection. We consider a period of 10 days since the beginning of the third epidemic wave of COVID-19 in Japan. The period goes from 19 to 29 October 2020.
- Step 1: In Figure 5, we fit an exponential function (15) to the cumulative number of reported cases of COVID-19 in Japan between 19 and 29 October 2020.
Figure 5.
In this figure, the black dots correspond to the cumulative numbers of reported cases of COVID-19 in Japan between 19 October and 29 October 2020. The red curve corresponds to the best fit of model (15) to the cumulative numbers of reported cases.
In Figure 5, the best fit of model (15) is obtained for
Hence,
- Step 2: Next, we consider the residual left after the previous fit,
In Figure 6, we fit the model (18) to the first residual function .
Figure 6.
In this figure, the black dots correspond to the first residual function between 19 October and 29 October 2020. The red curve corresponds to the best fit of model (18) to this residual.
In Figure 6, the best fit of model (18) (i.e., minimizing the sum-of-squares error) is obtained for
The period associated to is equal to days. This periodic phenomenon was observed in many countries (see for example [41]). Here,
By using
in (A3) below, with , we obtain
| (22) |
Since
therefore, the components of are not too large, and the above result should not be too sensitive to the stochastic errors. The main problem in (22) is the second component , which does not make sense in this context.
3.2. Spectral Truncation Method Applied to Ten Days Data
In the previous subsection, the first two fits make perfect sense. However, adding more fits would be questionable because they become more and more random after a few steps. We could alternatively continue to fit the rest by using our phenomenological model, which would provide new eigenvalues.
The major problem in the previous subsection is that when we apply formula (A3) with all the eigenvalues, we obtain some daily reproduction numbers with negative values. Here, instead, we increase the dimension n of L, and we use only the eigenvalues .
3.2.1. Re-Normalizing Procedure
Assume that then by
where is a solution of (8), we obtain the following normalized equation
and by dividing the above equation by we obtain
where
| (23) |
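The normalization above can be sketched in a few lines. The sketch assumes that the substitution is N(t) = lam1^t M(t), so that M satisfies the same recursion with weights p(d) = R0(d) / lam1^d, and that every eigenvalue is divided by lam1. The names `normalize` and `denormalize` and the numerical values are hypothetical.

```python
def normalize(r0, lam1):
    """Weights p(d) = R0(d) / lam1**d of the rescaled recursion."""
    return [r / lam1 ** d for d, r in enumerate(r0, start=1)]


def denormalize(weights, lam1):
    """Inverse map: R0(d) = lam1**d * p(d)."""
    return [w * lam1 ** d for d, w in enumerate(weights, start=1)]


# Illustrative Markovian weights (non-negative, summing to one) and growth factor.
weights = [0.0, 0.5, 0.5]
lam1 = 1.02
r0 = denormalize(weights, lam1)

# If mu is an eigenvalue of the normalized problem, lam1 * mu is an eigenvalue
# of the original one. Here mu solves x**3 = 0.5 * x + 0.5.
mu = complex(-0.5, 0.5)
lam2 = lam1 * mu
characteristic_value = lam2 ** 3 - r0[0] * lam2 ** 2 - r0[1] * lam2 - r0[2]
```

When lam1 is the dominant eigenvalue, the normalized weights sum to one, which is exactly the Markovian property of Definition 1.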
By using this procedure, we can always fix the dominant eigenvalue of L to 1 by imposing that L is Markovian (see Definition 1). Then we use the following re-normalizing procedure for the eigenvalues
In Figure 7, we fit these eigenvalues and with the spectrum of Markovian Leslie matrices L on a mesh. We observe that the fit improves when the dimension of L increases.
Figure 7.
We plot the spectrum of the Markovian Leslie matrices L (red dots) when (respectively in (a–d)) giving the best match to the secondary eigenvalues and (green dots). We observe that the best fit of the two secondary eigenvalues remains far away from and for , then gets closer for , and is very close for and .
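The mesh search illustrated in Figure 7 can be sketched for the smallest case n = 3. The sketch assumes a Markovian characteristic polynomial x^3 - p1 x^2 - p2 x - p3 with p1 + p2 + p3 = 1, which factors as (x - 1)(x^2 + (1 - p1) x + p3), so the secondary eigenvalues follow from a quadratic. The function names, the mesh size, and the target value are hypothetical and are not the eigenvalues estimated in this article.

```python
import cmath


def secondary_eigenvalues(p1, p3):
    """Roots of x**2 + (1 - p1) * x + p3 for the weights (p1, 1 - p1 - p3, p3)."""
    b, c = 1.0 - p1, p3
    root = cmath.sqrt(b * b - 4.0 * c)
    return (-b + root) / 2.0, (-b - root) / 2.0


def best_markovian_match(target, mesh=50):
    """Scan all weights on a regular mesh of the simplex and keep the closest spectrum.

    Returns (distance, (p1, p2, p3)). Matching one eigenvalue of a conjugate
    pair is enough, because the spectrum of a real matrix is conjugate-symmetric.
    """
    best = None
    for i in range(mesh + 1):
        for k in range(mesh + 1 - i):
            p1, p3 = i / mesh, k / mesh
            p2 = (mesh - i - k) / mesh  # computed from integers to stay >= 0
            dist = min(abs(ev - target) for ev in secondary_eigenvalues(p1, p3))
            if best is None or dist < best[0]:
                best = (dist, (p1, p2, p3))
    return best


distance, matched_weights = best_markovian_match(complex(-0.5, 0.5))
```

For larger n the secondary eigenvalues no longer have a closed form, and a numerical eigenvalue solver would replace `secondary_eigenvalues`; the number of mesh points also grows quickly with n.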
In Figure 8, we observe that, for , there is a unique set of eigenvalues of L (classified with decreasing real part) minimizing the distance and . This is no longer true for .
Figure 8.
We plot the spectrum of the Leslie matrix L (red dots) when (respectively in (a–d)) giving the best match to the secondary eigenvalues and (green dots). The red dots correspond to the spectrum of L for all the possible matrices L, having their second pair of eigenvalues close to the minimal distance to and .
3.2.2. Daily Basic Reproduction Numbers
In Figure 9, we plot the average distribution , standard deviation (blue region), and confidence interval.
Figure 9.
In this figure, we use the distributions minimizing the distance and whenever . In (a), we plot the average distribution (red curve), standard deviation (blue region), and confidence interval (light blue region). In (b), we plot the 24 distributions . In (c), we give a histogram with the multiple values of . We observe that some of the are similar to the case , with a maximum on day , but on average the maximum value is on day 7.
In Figure 10, we plot the daily basic reproduction numbers .
Figure 10.
We plot the daily basic reproduction numbers obtained for in (a), in (b), in (c), and in (d). The distribution for corresponds to the red curve in Figure 9.
We can notice that following [42], the effective is between and on 19 October 2020, in Japan.
3.2.3. Applying the Model to Daily Number of Reported Cases
The model used to run the simulations is the following
| (24) |
and, according to formulas (17) and (20), with the initial condition
| (25) |
with
| (26) |
In (24)–(26) we use the parameter values estimated in Section 3.1.
In Figure 11, we plot the daily number of reported cases data from October 19 to November 19, 2020 (black dots) and from model (24) and (25) with the values of obtained in Figure 10c (red dots).
Figure 11.
In this figure, we plot the daily number of reported cases data from October 19 to November 19, 2020 (black dots) and from model (24) and (25) with the values of obtained in Figure 10c (red dots).
3.3. Extension of the Spectral Truncation Method over One Month
In Figure 12, we apply the AutoCorrelation Function (ACF) and the Partial AutoCorrelation Function (PACF) to the daily number of cases for Japan between 19 October and 19 November 2020. The result does not look like any standard case. In the ACF, the correlation is significant up to a lag of 7 days, while in the PACF it is significant up to 16 days.
Figure 12.
Autocorrelation Function (ACF) (left hand side) and Partial Autocorrelation Function (PACF) (right hand side) applied to the daily number of cases for Japan between 19 October and 19 November 2020.
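For readers unfamiliar with the ACF, the following minimal sketch computes the standard sample autocorrelation of a series. The function name and the synthetic weekly pattern are illustrative assumptions and are unrelated to the Japanese data.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation r(k) for k = 0, ..., max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / variance
        for k in range(max_lag + 1)
    ]


# Four weeks of a purely weekly pattern (hypothetical values).
weekly_pattern = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0] * 4
acf = autocorrelation(weekly_pattern, max_lag=14)
```

On such a series the autocorrelation is large at lags that are multiples of 7 and negative in between, which is how a weekly reporting rhythm would show up in a plot like Figure 12.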
- Step 1: In Figure 13, we fit the model
| (27) |
to the cumulative number of reported cases data between 19 October and 19 November 2020.
Figure 13.
In this figure, we plot the cumulative number of reported cases data between 19 October and 19 November 2020 (black dots). We plot the best fit of the model (27) to the cumulative data (red curve).
We obtain the following parameter values for the best fit
| (28) |
- Step 2: Next, we define as before the first residual
| (29) |
and we fit the with the model
| (30) |
In Figure 14, we plot the cumulative number of reported cases data between 19 October and 19 November 2020.
Figure 14.
In this figure, we plot the cumulative number of reported cases data between 19 October and 19 November 2020 (black dots). We plot the best fit of the model (30) to the cumulative data (red curve).
The parameters of the phenomenological model obtained for the best fit are the following
| (31) |
and
| (32) |
The periods associated to and are, respectively,
These periods are close to multiples of 7 days.
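The relation between a complex eigenvalue and its period can be made explicit with a short sketch. It assumes a discrete-time eigenvalue written as exp(alpha + i omega) with a time unit of one day, so that the period is 2 pi / |omega|. The function name and the two eigenvalues below are hypothetical, not the fitted values.

```python
import cmath
import math


def period_in_days(eigenvalue):
    """Oscillation period of eigenvalue**t for a complex discrete-time eigenvalue."""
    angle = abs(cmath.phase(eigenvalue))
    if angle == 0.0:
        raise ValueError("a positive real eigenvalue does not oscillate")
    return 2.0 * math.pi / angle


# Damped oscillations with periods of one and three weeks (illustrative values).
weekly = cmath.exp(complex(-0.02, 2.0 * math.pi / 7.0))
three_weekly = cmath.exp(complex(-0.01, 2.0 * math.pi / 21.0))
periods = (period_in_days(weekly), period_in_days(three_weekly))
```

The modulus of the eigenvalue controls damping or amplification, while only its argument determines the period.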
Remark 5.
It is important to note that the period of 21 days is difficult to explain mechanistically, but this value is the smallest value giving the best fit to the data. We tried to impose some upper bounds smaller than 21 days. In such a case, the fitted period is always equal to the upper bound. This is true for all constraints less than 21 days, and for each constraint larger than 22 days, we recover the same period.
Remark 6.
It is important to note that . That is because, during the fit, we impose that and . This condition comes from the Perron-Frobenius theorem, in order to obtain
This condition comes from the fact that must be the spectral radius of L and belong to the disk centered at 0 with radius equal to the spectral radius of L (i.e., with a modulus less than or equal to ).
Eigenvalues associated to the model and : The first eigenvalue is
The second pair of complex conjugated eigenvalues is
and
and the modulus of is
The fourth eigenvalue is
and the fifth eigenvalue is its conjugate
and the modulus of is
Using and as an estimator: Next we consider all the matrices L in which the component is replaced by , and we assume that
The dominant eigenvalue of L is 1, and we look for matrices such that the second eigenvalue of L is close to
and the fourth eigenvalue of L is close to
To carry out this approach, we minimize the
where
where is the set of all eigenvalues of L.
In Figure 15, we consider the such that the corresponding maximum satisfies
Figure 15.
In this figure, we consider the case . We plot the distributions of daily basic reproduction numbers corresponding to the distributions having some secondary eigenvalues and fourth eigenvalues at a distance less than to the best match. The red curve is the average distribution . The blue region corresponds to the standard deviation around the mean distribution. The light blue region corresponds to the confidence interval.
We define
| (33) |
In Figure 16, we obtain a good description of the dynamics of infection at the individual level that confirms the one obtained over shorter periods. As expected, the average patient first loses the ability to transmit the pathogen: the daily reproduction number decreases from day 1 to day 4 and then increases between day 4 and day 7. Day 7 is a maximum. After day 7, it decays until day 9. Then a second peak arises, with a maximum on day 14. We could explain this second peak by supposing that an important transmission of pathogen still exists from day 12 to day 16. We also obtain a third peak from day 19 to day 23, with a maximum value on day 21.
Figure 16.
In this figure, we consider the case . We plot the distributions of daily basic reproduction numbers , where is the red curve in Figure 15.
In Figure 17, we plot the spectrum of the Leslie matrix L when corresponds to the average distribution (i.e., the red curve in Figure 15).
Figure 17.
In this figure, we consider the case . We plot the spectrum of the Leslie matrix L (red dots) when corresponds to the average distribution (i.e., the red curve in Figure 15).
Recalling that, by definition, the basic reproduction number is
we obtain the sum of the daily reproduction numbers (red curve in Figure 16)
In Figure 18, we plot a histogram for the values of the basic reproduction number obtained by summing the distributions from Figure 16.
Figure 18.
In this figure, we consider the case , and we plot a histogram for the values of the basic reproduction number obtained by summing the distributions from Figure 16.
Next, we consider
| (34) |
and, according to formulas (17) and (20), with the initial condition for , we have
| (35) |
with
| (36) |
In (34)–(36), we use the parameter values (28), (31), and (32) estimated above.
In Figure 19, we see that the mean distribution produces oscillations around the tendency for the daily number of cases. It is important to note that without the third peak in Figure 16 we do not obtain such a good correspondence between the model and the data.
Figure 19.
In this figure, we plot the daily number of reported cases data between 19 October and 19 November 2020 (black dots). The red curve corresponds to , and the green dots correspond to (34) and (35) whenever comes from the average distribution (i.e., the red curve in Figure 15). We observe a very good match between the green dots and the red curve (the phenomenological model).
4. Discussion
In this article, we start by investigating the connection between a signal decomposed into a sum of damped or amplified oscillations and a renewal equation. Namely, we connect the daily number of reported cases written as
with the renewal equation
In the context of epidemic time series, a spectral method usually refers to the Fourier decomposition of a periodic signal. In the present paper, the data are not periodic and are composed of an exponential function (Malthusian growth) perturbed by some damped oscillating functions. So we use complex numbers with nonzero real parts. We refer to Cazelles et al. [33] for more results about time series.
4.1. Data over Ten Days
In Figure 9, Figure 10, and Table 1, both the daily reproduction number and the instantaneous reproduction number are estimated. Concerning the instantaneous (or effective) reproduction number [43,44] estimated by [42], which equals 1.1 on 19 October 2020, the best fit corresponds to days (see (c) in Figure 9). This duration of the contagiousness period is close to 6 or 7 days, values that are themselves close to those estimated from the virulence measured in [14,45,46]. In Figure 10, we always obtain a U-shaped distribution for the curve of daily reproduction numbers. This corresponds to the biphasic form of the virulence already observed in respiratory viruses, such as influenza, as recalled in the Introduction.
Table 1.
The above reproduction numbers are obtained by using the formula .
| n | 3 | 5 | 6 | 7 |
|---|---|---|---|---|
| Reproduction number | 1.02 | 1.04 | 1.06 | 1.07 |
This temporal behavior of the contagiousness can correspond to the evolution of contagious symptoms like cough or spitting, which diminish during the innate immune response, followed by a comeback of the symptoms before the adaptive immune response (whenever the innate defense has been overcome by the virus). If the innate cellular immunity has not been sufficient to eliminate the virus, the viral load increases again, causing a reappearance of the symptoms before the adaptive immunity (cellular and humoral) occurs, which results in a transient decrease in contagiousness between the two immunologic phases. The medical recommendations are, in the case of U-shaped contagiousness, never to mistake a transient improvement for a permanent disappearance of the symptoms and to stay at home to avoid a bacterial secondary infection that is possibly fatal.
The estimation of the daily reproduction numbers in the COVID-19 outbreak constitutes an important issue. At the public health level, to publish only the sum of the daily reproduction numbers, that is, to say the basic reproduction number or the effective reproduction number Re, could suffice for controlling and managing the behavior of a whole population with mitigation or vaccination measures. At the individual level, it is important to know the existence of a minimum of the daily reproduction numbers, which generally corresponds to a temporary clinical improvement, after a partial success of the innate immune defense. This makes it possible to advise the patient to continue to respect his own isolation, prevention, and therapy choices (depending on his vaccination state) even if this transient clinical improvement has occurred. The present methodology allows also to estimate both the individual contagiousness duration in a dedicated age class and also its seasonal variations, which is crucial for optimizing the benefit–risk decisions of the public and individual health policies.
4.2. Data over One Month
Over one month, we obtain a daily reproduction number with three peaks, centered respectively on 7 days, 14 days, and 21 days. These values coincide with the periods of 7 days and 21 days obtained in Figure 14 when fitting the first residual, that is, the cumulative data minus the first exponential growth fit. As far as we understand the problem, it is the period of 21 days in the data that induces the third peak. This third peak is very suspicious. Nevertheless, the data lead us to such a shape for the daily reproduction number. We also tried to produce Figure 19 without the third peak and obtained a bad fit to the data, whereas with the third peak the fit is good. One may also note that the 21-day period is not significant in the ACF and the PACF in Figure 12.
Several explanations are possible for this strange shape of the daily reproduction number obtained from the data over one month. One possible explanation is that the Japanese population should be subdivided into several groups having very different infection dynamics (at the level of a single patient). Here we have in mind patients with a short infection period but high transmissibility (super spreaders) versus patients with a long infection period and mild symptoms.
We suspect that such a shape for the daily reproduction number could be attributed to the time since infection to report a case. The daily number of reported cases would be obtained from , and the daily number of new infected cases by using the following model
where the integer is the maximum number of days needed to report a case, is the fraction reported, and is the probability to report a case after d days. Therefore, we must have
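A minimal sketch of a reporting-delay model of this kind is given below. It assumes that the daily number of reported cases is the fraction reported times a delay-weighted sum of past new infections, and that the delay probabilities sum to 1. The names `reported_cases`, `delay_probs`, and `fraction_reported`, the convention that delays start at 0 days, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def reported_cases(new_infections, delay_probs, fraction_reported):
    """Delay-convolve daily new infections into daily reported cases.

    delay_probs[d] is the assumed probability of reporting a case d days
    after infection, for d = 0, ..., len(delay_probs) - 1.
    """
    p = np.asarray(delay_probs, dtype=float)
    if not np.isclose(p.sum(), 1.0):
        raise ValueError("delay probabilities must sum to 1")
    # Full convolution, truncated to the observation window; days before
    # the start of the series are treated as having zero infections.
    full = np.convolve(np.asarray(new_infections, dtype=float), p)
    return fraction_reported * full[: len(new_infections)]

# Hypothetical exponential phase and delay distribution.
infections = 100.0 * 1.05 ** np.arange(40)
p_delay = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
reports = reported_cases(infections, p_delay, fraction_reported=0.5)
```

Under these assumptions, a purely exponential infection curve yields reported cases with the same growth rate once the delay window is filled, so the delay reshapes the signal around the tendency without changing the tendency itself.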
4.3. Perspectives and Conclusions
In the present paper, we only consider the Japanese data in the exponential phase of the third epidemic wave.
The case of Japan seems emblematic to us, as it corresponds to a wave of well-identified new cases following a clearly characterized endemic phase. Because the exponential growth phenomenon is transitory, the duration of the sampling is relatively limited: it corresponds to a period in days during which the epidemiological parameters (such as the transmission rate) can be considered constant. In such circumstances, the Gaussian nature of the errors is difficult to establish because of the small sample size, which is why similar methods based on wavelets have been proposed (Walden and Hosken [37]).
The method of the present paper should be applied to several countries for each epidemic wave to obtain a more systematic study. For the moment, over one month, we obtained a shape for the daily reproduction number that follows the data very well. However, we are suspicious about the third peak. We suspect that the flaw in our analysis comes from the model itself. Such a question has been recently studied by Ioannidis and his collaborators in [47], and we believe that we are facing such modeling difficulties.
Appendix A. Non Identifiability Result
From Formula (13), we deduce that the characteristic Equation (12) has exactly one positive solution. By the Perron–Frobenius theorem applied to the Leslie matrix L defined by (11), we know that (by considering the norm of linear operator) the spectral radius of L is the unique positive solution of (12). Moreover, all the remaining eigenvalues have a modulus smaller than or equal to this spectral radius. We refer to ([48], Chapter 4) for more results about this subject.
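The statement above can be checked numerically on a small example. The sketch below assumes that the Leslie matrix has the daily reproduction numbers on its first row and ones on its subdiagonal, and that the characteristic equation reads `sum_a R(a) * lam**(-a) = 1`. Both forms, the helper names, and the U-shaped values in `daily_r` are assumptions made for illustration.

```python
import numpy as np

def leslie_matrix(daily_r):
    """Companion-type Leslie matrix: first row daily_r, ones below the diagonal."""
    n = len(daily_r)
    L = np.zeros((n, n))
    L[0, :] = daily_r
    L[np.arange(1, n), np.arange(0, n - 1)] = 1.0
    return L

def characteristic(lam, daily_r):
    """Left side minus right side of sum_a R(a) * lam**(-a) = 1."""
    a = np.arange(1, len(daily_r) + 1)
    return np.sum(np.asarray(daily_r) * lam ** (-a.astype(float))) - 1.0

# Hypothetical U-shaped daily reproduction numbers over 4 days (sum = 1.1).
daily_r = np.array([0.5, 0.1, 0.1, 0.4])
L = leslie_matrix(daily_r)
eigs = np.linalg.eigvals(L)
rho = np.max(np.abs(eigs))  # spectral radius of L
```

In this example, `rho` is larger than 1 because the daily reproduction numbers sum to more than 1, and it solves the assumed characteristic equation up to rounding error.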
Non identifiability result: Let and . Then
is a known solution of (8) if and only if is a solution of the characteristic equation.
Assume that is given, and satisfies
Then if we define
we deduce that the equation (12) is satisfied for , and is a solution of (8). We conclude that a single function is not enough to identify .
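The non-identifiability can be illustrated with a short sketch. It assumes again the characteristic equation `sum_a R(a) * lam**(-a) = 1`, and rescales any nonnegative profile of weights so that a prescribed growth factor becomes its positive solution. The function names, the chosen growth factor, and the weights are hypothetical.

```python
import numpy as np

def profile_with_growth(lam, weights):
    """Rescale weights so that lam solves sum_a R(a) * lam**(-a) = 1."""
    w = np.asarray(weights, dtype=float)
    a = np.arange(1, len(w) + 1)
    return w / np.sum(w * lam ** (-a.astype(float)))

def dominant_root(daily_r):
    """Positive root of lam**n - sum_a R(a) * lam**(n - a) = 0."""
    coeffs = np.concatenate(([1.0], -np.asarray(daily_r, dtype=float)))
    roots = np.roots(coeffs)
    real_pos = roots[(np.abs(roots.imag) < 1e-10) & (roots.real > 0)].real
    return real_pos.max()

lam = 1.05  # hypothetical daily growth factor
flat = profile_with_growth(lam, [1, 1, 1, 1, 1])
u_shaped = profile_with_growth(lam, [3, 1, 0.5, 1, 3])
```

Both profiles produce the same exponential solution, although one is flat and the other is U-shaped, which is why a single exponential cannot determine the daily reproduction numbers.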
Appendix B. Identifiability Result
Assumption A1.
Assume that are nonzero complex numbers, and are separated two by two. That is,
and
Remark A1.
Since the coefficients of the characteristic Equation (12) are all real, we could also impose that the conjugate of each eigenvalue belongs to the spectrum. That is
However, that is not necessary in this subsection.
Remark A2.
When all the eigenvalues are real, the above assumption will be satisfied if and only if are nonzero real numbers which are pairwise distinct. Up to a permutation, that is
and
Lemma A1.
Let Assumption A1 be satisfied. Assume that each satisfies the characteristic Equation (12). Then the Leslie matrix L defined by (11) is diagonalizable (and invertible); moreover, for each ,
is a solution of (8). That is to say,
Identification of the components from the values of : Assume that the values of are given for . We claim that we can compute . Indeed,
can be rewritten as the system
| (A1) |
The determinant of the above Vandermonde-like matrix
therefore,
Therefore, under Assumption A1, this determinant is non null, and we obtain the following result.
Proposition A1.
Let Assumption A1 be satisfied. Then we can compute the components as a function of the given elements of the trajectory by solving the linear system (A1), and
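The following sketch shows how such a Vandermonde system could be solved numerically. It assumes that the trajectory has the form `N(t) = sum_k c_k * lambda_k**t` and that its values are known for `t = 0, ..., n - 1`. The indexing convention, the function name, and the numerical values are assumptions made for this example.

```python
import numpy as np

def coefficients_from_trajectory(eigenvalues, values):
    """Solve sum_k c_k * lambda_k**t = N(t), t = 0..n-1, for the c_k."""
    lam = np.asarray(eigenvalues, dtype=complex)
    n = len(lam)
    # V[t, k] = lambda_k ** t (transpose of NumPy's increasing Vandermonde).
    V = np.vander(lam, N=n, increasing=True).T
    return np.linalg.solve(V, np.asarray(values, dtype=complex))

# Hypothetical pairwise distinct, nonzero eigenvalues and coefficients.
eigs = np.array([1.05, 0.9 * np.exp(2j * np.pi / 7), 0.9 * np.exp(-2j * np.pi / 7)])
c_true = np.array([100.0, 10.0, 10.0], dtype=complex)

t = np.arange(len(eigs))
values = np.array([np.sum(c_true * eigs ** k) for k in t])
recovered = coefficients_from_trajectory(eigs, values)
```

The system is solvable precisely because the eigenvalues are pairwise distinct, which keeps the Vandermonde determinant away from zero.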
Identification of the component from the : By assuming that each is a solution of the characteristic Equation (12), we obtain
| (A2) |
which rewrites in the matrix form as
Under Assumption A1 the Vandermonde-like matrix
is invertible, because
hence
Therefore, we can compute the component of the map by solving a linear system involving the eigenvalues of the characteristic equation.
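A round-trip check of this identification can be sketched as follows. The sketch assumes that each eigenvalue satisfies `lambda_k**n = sum_a R(a) * lambda_k**(n - a)`, which gives a Vandermonde-like linear system for the `R(a)`. The function name and the test profile `r_true` are hypothetical.

```python
import numpy as np

def daily_r_from_eigenvalues(eigenvalues):
    """Solve sum_a R(a) * lambda_k**(n - a) = lambda_k**n for R(1..n)."""
    lam = np.asarray(eigenvalues, dtype=complex)
    n = len(lam)
    # M[k, a - 1] = lambda_k ** (n - a); NumPy's default ordering is decreasing.
    M = np.vander(lam, N=n, increasing=False)
    return np.linalg.solve(M, lam ** n)

# Hypothetical daily reproduction numbers, used only to generate a spectrum.
r_true = np.array([0.5, 0.1, 0.1, 0.4])
# Roots of lam**4 - 0.5*lam**3 - 0.1*lam**2 - 0.1*lam - 0.4.
eigs = np.roots(np.concatenate(([1.0], -r_true)))
recovered = daily_r_from_eigenvalues(eigs)
```

When the full spectrum is known and its elements are pairwise distinct, the recovered values are real up to rounding error and match the profile used to generate the spectrum.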
Theorem A1.
Let Assumption A1 be satisfied. Then the following properties are equivalent
In Figure A1, we plot the location of the spectra of Markovian Leslie matrices on a mesh. We can observe how the location of the spectrum changes with the dimension n. It seems that the spectrum fills more and more of the unit circle in the complex plane when the dimension increases. We refer to Kirkland [49] for more results going in that direction.
Figure A1.
We plot all the spectrum’s locations for Markovian Leslie matrices on a mesh whenever in (a), in (b), in (c), and in (d). Here the dominant eigenvalue is always 1, and we can see the corresponding isolated blue dot. The blue region corresponds to the spectrum of Markovian Leslie matrices whenever . The red region corresponds to the spectrum of Markovian Leslie matrices whenever .
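A computation of this kind could be sketched as below. The sketch assumes that "Markovian" means the first row of the Leslie matrix is a probability vector, which is consistent with the dominant eigenvalue being 1, and it samples that row on a regular mesh. The function name, the mesh construction, and the chosen dimension and resolution are assumptions, not the settings used for Figure A1.

```python
import itertools
import numpy as np

def markovian_leslie_spectra(n, steps):
    """Eigenvalues of n x n Leslie matrices whose first row lies on a
    regular mesh of the probability simplex (entries k/steps summing to 1)."""
    spectra = []
    for ks in itertools.product(range(steps + 1), repeat=n - 1):
        if sum(ks) > steps:
            continue
        row = np.array(ks + (steps - sum(ks),), dtype=float) / steps
        L = np.zeros((n, n))
        L[0, :] = row
        L[np.arange(1, n), np.arange(n - 1)] = 1.0  # subdiagonal of ones
        spectra.append(np.linalg.eigvals(L))
    return np.array(spectra)

# Each row of `spectra` holds the eigenvalues of one matrix of the mesh.
spectra = markovian_leslie_spectra(n=3, steps=10)
```

Plotting the real and imaginary parts of `spectra` as a scatter plot would give a picture comparable in spirit to one panel of the figure.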
Continuous dependency of the component with respect to the : Define the set of all the elements satisfying Assumption A1. For each we define
Theorem A2.
Consider a sequence and a point (i.e., all satisfying Assumption A1). Assume that
then
where
Proof.
We have
Subtracting the two above quantities, we obtain
(A4) which is also equivalent to
hence,
Setting
and so
Now since
we deduce that
hence, for all large enough (i.e., satisfying )
and the proof is completed. □
Appendix C. Identification of the Phenomenological Model
Here we assume that the daily number of reported cases has the following form
| (A5) |
where are non null, and are pairwise distinct.
If we assume to know for all positive integer values then we can compute the discrete Laplace transform
which is well defined for all such that
By using (A5), we obtain
whenever .
Let be an integer such that
we obtain
The Laplace transform could be used to identify the unknown parameters in (A5). Then by combining this idea with linear regression of , we could identify the parameters , then step by step compute all the parameters of in (A5).
In practice, we only know on a finite time interval . In that case, we can define the truncated Laplace transform as
and we have by (27)
The Laplace transform does not allow us to detect the eigenvalues (we tested, without success, some examples with complex values coming from the present article). The identification of the eigenvalues when the data are known only on a finite time interval seems to be an intriguing open question.
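For concreteness, the truncated transform can be compared with its closed form on a synthetic signal. The sketch assumes a signal of the form `N(t) = sum_k c_k * lambda_k**t` and a transform written as `sum_{t=0}^{T} N(t) * z**(-t)`. This notation, the function names, and the numerical values are assumptions and may differ from the convention used in (A5).

```python
import numpy as np

def truncated_transform(values, z):
    """Finite sum sum_{t=0}^{T} N(t) * z**(-t) with T = len(values) - 1."""
    t = np.arange(len(values))
    return np.sum(np.asarray(values) * z ** (-t.astype(float)))

def truncated_transform_closed_form(eigenvalues, coefficients, z, horizon):
    """Geometric-sum formula sum_k c_k * (1 - q_k**(T+1)) / (1 - q_k), q_k = lambda_k / z."""
    q = np.asarray(eigenvalues, dtype=complex) / z
    c = np.asarray(coefficients, dtype=complex)
    return np.sum(c * (1 - q ** (horizon + 1)) / (1 - q))

# Hypothetical eigenvalues and coefficients.
eigs = np.array([1.05, 0.9 * np.exp(2j * np.pi / 7), 0.9 * np.exp(-2j * np.pi / 7)])
coefs = np.array([100.0, 10.0, 10.0])
t = np.arange(30)
values = np.real(sum(c * lam ** t for lam, c in zip(eigs, coefs)))

z = 2.0
numeric = truncated_transform(values, z)
closed = truncated_transform_closed_form(eigs, coefs, z, horizon=len(values) - 1)
```

The two quantities agree, but the closed form shows that the truncation term depends on the horizon as well as on the eigenvalues, which hints at why reading the eigenvalues off a finite window is delicate.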
Appendix D. About Residual 2 (t) in Section 3.3
In Figure A2, we observe that the average of Residual 2 (t) is close to 0, but its histogram does not have the shape of a normal distribution. So, there might be some residual information in Residual 2 (t).
Figure A2.
In this figure, we plot .
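One simple way to quantify how far a residual departs from a normal distribution is to compute its sample skewness and excess kurtosis, which are both close to 0 for Gaussian data. The sketch below uses a toy sinusoidal residual, since the actual residual values are not reproduced here. The function name and the toy signal are hypothetical.

```python
import numpy as np

def moment_summary(x):
    """Return the sample mean, skewness, and excess kurtosis of x."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    skew = np.mean(centered ** 3) / m2 ** 1.5
    excess_kurt = np.mean(centered ** 4) / m2 ** 2 - 3.0
    return x.mean(), skew, excess_kurt

# Toy residual: a pure 7-day oscillation observed over four full periods.
t = np.arange(28)
toy_residual = np.sin(2 * np.pi * t / 7)
mean, skew, excess_kurt = moment_summary(toy_residual)
```

For this toy signal the mean is zero while the excess kurtosis equals -1.5, so a zero mean alone does not indicate Gaussian noise when an oscillation is still present.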
Author Contributions
Conceptualization, J.D. and P.M.; methodology, P.M.; software, P.M.; writing—original draft preparation, J.D. and P.M.; writing—review and editing, J.D. and P.M. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
There is no subject involved in the present study.
Data Availability Statement
No data were produced for this study.
Conflicts of Interest
The authors declare no conflict of interest.
Funding Statement
This research received no external funding.
Footnotes
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Demongeot J., Griette Q., Maday Y., Magal P. Kermack-McKendrick model with age of infection starting from a single or multiple cohorts of infected patients. arXiv. 2022; arXiv:2205.15634.
- 2. Kermack W.O., McKendrick A.G. Contributions to the mathematical theory of epidemics: II. Proc. R. Soc. Lond. Ser. B. 1932;138:55–83.
- 3. Chao D.L., Halloran M.E., Obenchain V.J., Longini I.M., Jr. FluTE, a publicly available stochastic influenza epidemic simulation model. PLoS Comput. Biol. 2010;6:e1000656. doi: 10.1371/journal.pcbi.1000656.
- 4. Itoh Y., Shichinohe S., Nakayama M., Igarashi M., Ishii A., Ishigaki H., Ishida H., Kitagawa N., Sasamura T., Shiohara M., et al. Emergence of H7N9 Influenza A Virus Resistant to Neuraminidase Inhibitors in Nonhuman Primates. Antimicrob. Agents Chemother. 2015;59:4962–4973. doi: 10.1128/AAC.00793-15.
- 5. Alvarez L., Colom M., Morel J.D., Morel J.M. Computing the daily reproduction number of COVID-19 by inverting the renewal equation using a variational technique. Proc. Natl. Acad. Sci. USA. 2021;118:e2105112118. doi: 10.1073/pnas.2105112118.
- 6. Alvarez L., Morel J.-D., Morel J.-M. Modeling COVID-19 incidence by the renewal equation after removal of administrative bias and noise. Biology. 2022;11:540. doi: 10.3390/biology11040540.
- 7. Demongeot J., Griette Q., Magal P. SI epidemic model applied to COVID-19 data in mainland China. R. Soc. Open Sci. 2020;7:201878. doi: 10.1098/rsos.201878.
- 8. Griette Q., Demongeot J., Magal P. What can we learn from COVID-19 data by using epidemic models with unidentified infectious cases? Math. Biosci. Eng. 2021;19:537–594. doi: 10.3934/mbe.2022025.
- 9. Griette Q., Demongeot J., Magal P. A robust phenomenological approach to investigate COVID-19 data for France. Math. Appl. Sci. Eng. 2021;2:149–218.
- 10. Nishiura H. Time variations in the transmissibility of pandemic influenza in Prussia, Germany, from 1918–19. Theor. Biol. Med. Model. 2007;4:20. doi: 10.1186/1742-4682-4-20.
- 11. Nishiura H., Chowell G. The Effective Reproduction Number as a Prelude to Statistical Estimation of Time-Dependent Epidemic Trends. In: Chowell G., Hyman J.M., Bettencourt L.M.A., Castillo-Chavez C., editors. Mathematical and Statistical Estimation Approaches in Epidemiology. Springer; Dordrecht, The Netherlands: 2009. pp. 103–121.
- 12. Bakhta A., Boiveau T., Maday Y., Mula O. Epidemiological forecasting with model reduction of compartmental models. Application to the COVID-19 pandemic. Biology. 2020;10:22. doi: 10.3390/biology10010022.
- 13. Waku J., Oshinubi K., Demongeot J. Maximal reproduction number estimation and identification of transmission rate from the first inflection point of new infectious cases waves: COVID-19 outbreak example. Math. Comput. Simul. 2022;198:47–64. doi: 10.1016/j.matcom.2022.02.023.
- 14. Pan Y., Zhang D., Yang P., Poon L.L.M., Wang Q. Viral load of SARS-CoV-2 in clinical samples. Lancet Infect. Dis. 2020;20:411–412. doi: 10.1016/S1473-3099(20)30113-4.
- 15. Whittle P. Hypothesis Testing in Time Series Analysis. Almqvist & Wiksell; Uppsala, Sweden: 1951.
- 16. Whittle P. Prediction and Regulation. English Universities Press; London, UK: 1963.
- 17. Wiener N. Extrapolation, Interpolation, and Smoothing of Stationary Time Series. MIT Press; Cambridge, MA, USA: 1949.
- 18. Chan K.S., Tong H. A note on certain integral equations associated with non-linear time series analysis. Probab. Th. Rel. Fields. 1986;73:153–158.
- 19. Lim K.S., Tong H. A statistical approach to difference-delay equation modelling in ecology: Two case studies. J. Time Ser. Anal. 1983;4:239–267. doi: 10.1111/j.1467-9892.1983.tb00372.x.
- 20. Priestley M.B. Spectral Analysis and Time Series. Academic Press; Cambridge, MA, USA: 1981.
- 21. Ramsay J.O. Monotone Regression Splines in Action. Stat. Sci. 1988;3:425–441. doi: 10.1214/ss/1177012761.
- 22. Ramsay J., Hooker G. Dynamic Data Analysis: Modeling Data with Differential Equations. Springer; New York, NY, USA: 2017.
- 23. Tong H. Non-Linear Time Series: A Dynamical System Approach. Oxford University Press; Oxford, UK: 1990.
- 24. Tuan P.D. The estimation of parameters for autoregressive moving average models. J. Time Ser. Anal. 1984;5:53–68.
- 25. Priestley M.B. Evolutionary spectra and non-stationary processes. J. R. Stat. Soc. Ser. 1965;27:204–229. doi: 10.1111/j.2517-6161.1965.tb01488.x.
- 26. Malthus T.R. An Essay on the Principle of Population as It Affects the Future Improvement of Society, with Remarks on the Speculations of Mr. Godwin, M. Condorcet, and Other Writers. J. Johnson; London, UK: 1798.
- 27. Fisher R.A. The Wave of Advance of Advantageous Genes. Ann. Eugen. 1937;7:353–369. doi: 10.1111/j.1469-1809.1937.tb02153.x.
- 28. Lambert J.H. Beytrage Zum Gebrauche Der Mathematik Und Deren Anwendung. Verlage des Buchladens der Realschule; Berlin, Germany: 1765. p. 72.
- 29. Euler L. Recherches générales sur la mortalité et la multiplication du genre humain. Mémoires de l’Académie des Sciences de Berlin. 1767;16:144–164.
- 30. Lotka A.J. Relation between birth rates and death rates. Science. 1907;26:121–130. doi: 10.1126/science.26.653.21.b.
- 31. Leslie P.H. On the use of matrices in certain population mathematics. Biometrika. 1945;33:183–212. doi: 10.1093/biomet/33.3.183.
- 32. Hahn G.M. Mammalian cell populations. Math. Biosci. 1970;6:295–315. doi: 10.1016/0025-5564(70)90069-6.
- 33. Cazelles B., Chavez M., Magny G.C.D., Guégan J.F., Hales S. Time-dependent spectral analysis of epidemiological time-series with wavelets. J. R. Soc. Interface. 2007;4:625–636. doi: 10.1098/rsif.2007.0212.
- 34. Robinson E.A. Predictive deconvolution of time series with application to seismic exploration. Geophysics. 1967;32:418–484. doi: 10.1190/1.1439873.
- 35. Peacock K.L., Treitel S. Predictive deconvolution: Theory and practice. Geophysics. 1969;34:155–169. doi: 10.1190/1.1440003.
- 36. Robinson E.A., Treitel S. Geophysical Signal Analysis. Prentice-Hall, Inc.; Englewood Cliffs, NJ, USA: 1980.
- 37. Walden A.T., Hosken J.W.J. The nature of non-Gaussianity of primary reflection coefficients and its significance for deconvolution. Geophys. Prosp. 1986;34:1038–1066. doi: 10.1111/j.1365-2478.1986.tb00512.x.
- 38. Vinod D., Cherstvy A.G., Wang W., Metzler R., Sokolov I.M. Nonergodicity of reset geometric Brownian motion. Phys. Rev. E. 2022;105:L012106. doi: 10.1103/PhysRevE.105.L012106.
- 39. Ritschel S., Cherstvy A.G., Metzler R. Universality of delay-time averages for financial time series: Analytical results, computer simulations, and analysis of historical stock-market prices. J. Phys. Complex. 2021;2:045003. doi: 10.1088/2632-072X/ac2220.
- 40. Data from WHO. (accessed on 20 July 2022). Available online: https://COVID19.who.int/WHO-COVID-19-global-data.csv.
- 41. Demongeot J., Oshinubi K., Rachdi M., Seligmann H., Thuderoz F., Waku J. Estimation of Daily Reproduction Numbers during the COVID-19 Outbreak. Computation. 2021;9:109. doi: 10.3390/computation9100109.
- 42. Powered by the Institute of Global Health, Faculty of Medicine, University of Geneva and the Swiss Data Science Center, ETH Zürich-EPFL. (accessed on 20 July 2022). Available online: https://renkulab.shinyapps.io/COVID-19-Epidemic-Forecasting/_w_850fb011/?tab=jhu_pred&country=Japan.
- 43. Cori A., Ferguson N.M., Fraser C., Cauchemez S. A new framework and software to estimate time-varying reproduction numbers during epidemics. Am. J. Epidemiol. 2013;178:1505–1512. doi: 10.1093/aje/kwt133.
- 44. Scire J., Nadeau S., Vaughan T., Brupbacher G., Fuchs S., Sommer J., Koch K.N., Misteli R., Mundorff L., Götz T., et al. Reproductive number of the COVID-19 epidemic in Switzerland with a focus on the Cantons of Basel-Stadt and Basel-Landschaft. Swiss Med. Wkly. 2020;150:w20271. doi: 10.4414/smw.2020.20271.
- 45. Kawasuji H., Takegoshi Y., Kaneda M., Ueno A., Miyajima Y., Kawago K., Fukui Y., Yoshida Y., Kimura M., Yamada H., et al. Transmissibility of COVID-19 depends on the viral load around onset in adult and symptomatic patients. PLoS ONE. 2020;15:e0243597. doi: 10.1371/journal.pone.0243597.
- 46. Kim S.E., Jeong H.S., Yu Y., Shin S.U., Kim S., Oh T.H., Kim U.J., Kang S.J., Jang H.C., Jung S.I., et al. Viral kinetics of SARS-CoV-2 in asymptomatic carriers and presymptomatic patients. Int. J. Infect. Dis. 2020;95:441–443. doi: 10.1016/j.ijid.2020.04.083.
- 47. Ioannidis J.P., Cripps S., Tanner M.A. Forecasting for COVID-19 has failed. Int. J. Forecast. 2022;38:423–438. doi: 10.1016/j.ijforecast.2020.08.004.
- 48. Ducrot A., Griette Q., Liu Z., Magal P. Differential Equations and Population Dynamics I: Introductory Approaches. Springer Nature; Berlin, Germany: 2022.
- 49. Kirkland S. On the spectrum of a Leslie matrix with a near-periodic fecundity pattern. Linear Algebra Its Appl. 1993;178:261–279. doi: 10.1016/0024-3795(93)90345-O.