Abstract
In this work, the time series of growth rates of confirmed cases and deaths of COVID-19 for several sampled countries are investigated by introducing an orthonormal basis. This basis, which serves as the feature benchmark, reveals the hidden features of COVID-19 via the magnitudes of the Fourier coefficients. These coefficients are ranked to form ranking vectors for all the sampled countries. Based on these vectors and the Manhattan metric, we then perform spectral clustering to categorise the countries. Unlike classical cosine similarity analysis, which is a composite index and makes it hard to identify the features of the categorised countries, spectral analysis delves into the internal structures or dynamical trends of the time series. This research shows that no single feature dominates the trend of the growth rates. It also reveals that the results from the spectral analysis differ from those of cosine similarity. Finally, some approximated values of the confirmed cases and deaths are calculated by the spectral analysis.
Introduction
At the time of writing, there are over 180 million confirmed cases and around 4 million deaths of COVID-19 [1]. The variants of COVID-19 are still in full swing and spreading across the continents [2, 3]. The spread of this pandemic has been studied by many researchers [4–7]. There are many ways to look into the behaviours of the viruses or the pandemic itself [8, 9], including the efficacy of travel bans or lockdowns [10]. More and more papers now focus on the efficacy and effectiveness of vaccines against the variants of SARS-CoV-2 [11, 12]. Some studies help shed light on resolving the pandemic via herd immunity [13] or map out the dynamical trajectory of the viruses [14, 15]. This body of research employs plenty of qualitative and quantitative methods, in particular statistical methods: regression-related models [16], Mann-Whitney U tests, Mann-Kendall tests, Spearman’s rho, etc. [17, 18]; factorial design [19]; artificial intelligence (AI) based methods [20, 21]; and some deep learning techniques in COVID-19 diagnosis [22, 23]. Coupled with AI, automatic reasoning for searching the hidden features of the trend of COVID-19 is also vital [24]. Some studies have even established the relations between cases and deaths of COVID-19 from demographic, economic and social perspectives [25].
An orthonormal basis [26], which is motivated by Fourier analysis [27], is employed so that the underlying frequencies of the data are taken into consideration. The COVID-19 database [28], which records the weekly COVID-19 cases and deaths from Week 15 to Week 51 (37 weeks in total), is utilised. By filtering out some non-essential data (countries), we obtain 43 countries as the research targets. By calculating the 36 growth rates (from Week 15 to Week 50) of the cases and deaths for each sampled country, we obtain a national case vector and a death vector. These two vectors are transformed into a set of coefficients, namely their inner products with the basis vectors; we then rank the coefficients by their face values and form ranking vectors for each country. The ranks indicate the strength of the relation between the growth rates and the underlying frequencies: a larger coefficient is assigned a larger rank. Then the Manhattan metric [29] is applied to measure the distances between all the ranking vectors, yielding a Manhattan distance matrix. Based on this matrix, we associate each country with its closer and further neighbours via the minimal pairing and maximal pairing. We then re-analyse the collected growth rates of confirmed cases and deaths with another typical approach: cosine similarity. Finally, an approximation method based on spectral analysis is devised to further predict the development of confirmed cases and deaths.
In sum, the current work shows that:
the patterned evolutionary correlation between countries is not random, i.e., there are some fundamental factors that contribute to such a relation;
the correlated patterns for cases and deaths between countries bear no similarity at all;
there is a strong discrepancy between the evolution of cases and that of deaths;
if the detected features are preserved, the development of confirmed cases and deaths is monotonic within each country, with the totals increasing in some countries and decreasing in others.
The clustering technique in this study should offer policymakers some knowledge for adopting sensible measures to control the pandemic.
Methodology and procedures
By expanding the concept of Fourier analysis, an orthogonal basis, which serves as the feature benchmark for the current study, is introduced. Let denote the set of positive integers and denote the set of real numbers. For any , is used to denote its i-th element, is used to denote its length and is used to denote the Euclidean norm. Assume , where N stands for a natural number. is used to denote its growth vector, i.e.,
Observe that . This growth vector is the main research target, since we study the (weekly) growth rates of cases and deaths regarding COVID-19. For any two vectors , , or is used to denote their inner product.
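As a minimal sketch of the growth vector, the following Python function assumes the growth rate takes the form (next value - current value) / (current value + 1), i.e. with 1 added to the denominator as described in the Procedures. The function name `growth_vector` is illustrative and does not come from the source.

```python
def growth_vector(totals):
    """Return the growth rates of a sequence of weekly counts.

    Assumed form: (x[t+1] - x[t]) / (x[t] + 1); the +1 avoids division by zero.
    A sequence of N + 1 counts yields N growth rates.
    """
    return [(nxt - cur) / (cur + 1) for cur, nxt in zip(totals, totals[1:])]


# Confirmed cases of country 4 for Weeks 15 to 17, taken from Table 6.
rates = growth_vector([533, 1835, 2960])
print([round(r, 2) for r in rates])  # [2.44, 0.61]
```

Under this assumption, the two printed values agree with the first two entries listed for country 4 in the cases block of Table 7.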
Orthogonal basis Define real functions and and , where . Define
where . By some mathematical manipulation, is proved to be an orthogonal basis for every natural number N.
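To illustrate how a vector is turned into Fourier-type coefficients by inner products, the Python sketch below uses the orthonormal DCT-II cosine basis as a stand-in. This choice is an assumption made only for illustration and is not necessarily the basis constructed in this work; the names `cosine_basis`, `inner` and `coefficients` are likewise illustrative.

```python
import math


def cosine_basis(n):
    """Orthonormal DCT-II cosine basis of R^n (a stand-in, not the paper's basis)."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n)])
    return basis


def inner(u, v):
    """Euclidean inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def coefficients(vector, basis):
    """Inner product of the vector with each basis vector."""
    return [inner(vector, e) for e in basis]


# Check orthonormality for 36 dimensions (the number of weekly growth rates).
basis = cosine_basis(36)
max_dev = max(
    abs(inner(basis[i], basis[j]) - (1.0 if i == j else 0.0))
    for i in range(36)
    for j in range(36)
)
print(max_dev < 1e-9)
```

For an orthonormal basis, the squared coefficients sum to the squared Euclidean norm of the vector, which gives a quick sanity check on any candidate basis.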
Manhattan metric and cosine similarity Let be arbitrary. A Manhattan metric d over is defined by [29]
| 1 |
It faithfully adds up the projected distances along each individual dimension. Cosine similarity between them is defined by
| 2 |
This is a classical approach measuring the relation or similarity between two vectors.
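The two measures can be sketched in a few lines of Python using their standard definitions: the Manhattan distance is the sum of absolute coordinate differences, and cosine similarity is the inner product divided by the product of the Euclidean norms. The function names are illustrative.

```python
import math


def manhattan(u, v):
    """Sum of absolute coordinate differences between two equally long vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Inner product divided by the product of Euclidean norms.

    Raises ZeroDivisionError if either vector is the zero vector.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


print(manhattan([1, 2, 3], [3, 2, 1]))      # 4
print(cosine_similarity([1, 0], [0, 1]))    # 0.0
```

Note that cosine similarity ignores the lengths of the vectors, whereas the Manhattan distance does not; this is one reason the two measures can group the same data differently.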
Data The weekly data (up to Week 51, 2020) of the reported COVID-19 total confirmed cases and deaths worldwide [28] were downloaded. The samples are the countries. To reduce sampling bias arising from insufficient population, inadequate healthcare support and missing data, only countries satisfying the following three criteria simultaneously qualify as sampled countries:
countries whose population is more than ten million;
countries whose healthcare systems are ranked [30] among the top 100;
countries whose data from Week 15 to Week 51, Year 2020 are available.
After filtering out the non-essential samples by the above criteria, we obtain 43 countries as shown in Table 1—each of which contains 37 weekly data (from Week 15 to Week 51) of total confirmed cases and deaths.
Table 1.
Sampled countries with labels
| No. | 1 | 2 | 3 | 4 |
| Country | Algeria | Argentina | Australia | Bangladesh |
| No. | 5 | 6 | 7 | 8 |
| Country | Belgium | Benin | Canada | Chile |
| No. | 9 | 10 | 11 | 12 |
| Country | Cuba | Czechia | Congo | Egypt |
| No. | 13 | 14 | 15 | 16 |
| Country | France | Germany | Greece | Guatemala |
| No. | 17 | 18 | 19 | 20 |
| Country | Indonesia | Iran | Italy | Japan |
| No. | 21 | 22 | 23 | 24 |
| Country | Jordan | Kazakhstan | Malaysia | Mexico |
| No. | 25 | 26 | 27 | 28 |
| Country | Morocco | Netherlands | Philippines | Poland |
| No. | 29 | 30 | 31 | 32 |
| Country | Portugal | Romania | Saudi Arabia | Senegal |
| No. | 33 | 34 | 35 | 36 |
| Country | South Korea | Spain | Sri Lanka | Sweden |
| No. | 37 | 38 | 39 | 40 |
| Country | Thailand | Tunisia | Turkey | Ukraine |
| No. | 41 | 42 | 43 | |
| Country | United Kingdom | United States | Venezuela |
These 43 labelled countries are sampled based on three criteria: over 10 million population, top 100 healthcare system, and available data for the set periods
Procedures The spectral analysis in the current work is conducted by the following procedures:
Prepare and compile the weekly accumulated confirmed cases and deaths from Week 15 to Week 51 in Year 2020 with respect to the sampled 43 countries. The sampled data are shown in Table 6.
Calculate the t-th week’s growth rates for confirmed cases of COVID-19 by the formula
where denotes the total number of confirmed cases at Week t for country i. For analytical purposes, 1 is deliberately added to the denominator to avoid division by zero. Similarly, we could calculate the t-th week’s growth rates of deaths for country i by . Then for each country i, a growth vector and are formed. The calculated results are tabulated in Table 7.
Choose an orthonormal basis and rename them by .
Calculate the magnitude vector for each growth vector by and each by , where . The results and are presented in Table 8;
Rank each element in by a natural number according to its face value and form a ranking vector. The one with higher face value would be assigned a higher rank. Follow the same method for ranking each . The results are presented in Table 9.
Calculate the Manhattan distance between all the ranking vectors. The resulting distance matrices are presented in Table 2.
Find the minimal pairs (or nearest neighbours) and maximal pairs (or furthest neighbours) for all the countries, i.e., the countries at the least and the greatest distance in the above Manhattan distance matrices. The visualised results are presented in Fig. 4.
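The ranking and distance steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the coefficient values are hypothetical, ties are broken by position (the source does not say how ties are handled), and the function names are illustrative.

```python
def rank_vector(values):
    """Assign ranks 1..n so that a larger value receives a larger rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def manhattan(u, v):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_matrix(rank_vectors):
    """Pairwise Manhattan distances between all ranking vectors."""
    return [[manhattan(u, v) for v in rank_vectors] for u in rank_vectors]


# Hypothetical coefficient magnitudes for three countries and four features.
magnitudes = [
    [0.20, 0.82, 0.03, 0.50],
    [0.90, 0.10, 0.40, 0.30],
    [0.05, 0.60, 0.70, 0.10],
]
ranks = [rank_vector(m) for m in magnitudes]
print(ranks[0])                    # [2, 4, 1, 3]
print(distance_matrix(ranks)[0])   # first row of the distance matrix
```

When every ranking vector is a permutation of 1..n, the Manhattan distance between two of them is always an even number, which is consistent with the entries of Table 2.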
Table 6.
Raw weekly total cases and total deaths of COVID-19
| Country | Week 15 | Week 16 | Week 17 | Week 49 | Week 50 | Week 51 |
|---|---|---|---|---|---|---|
| 1 | 594 | 715 | 753 | 6031 | 3850 | 3101 |
| 2 | 649 | 727 | 837 | 44299 | 35067 | 48955 |
| 3 | 578 | 290 | 101 | 72 | 66 | 167 |
| 4 | 533 | 1835 | 2960 | 15138 | 12988 | 10180 |
| 40 | 1469 | 2672 | 3168 | 90627 | 87360 | 70327 |
| 41 | 32205 | 32027 | 32830 | 105915 | 126161 | 190744 |
| 42 | 219936 | 202116 | 206223 | 1373677 | 1499756 | 1588085 |
| 43 | 33 | 75 | 69 | 2402 | 2735 | 2898 |

| Country | Week 15 | Week 16 | Week 17 | Week 49 | Week 50 | Week 51 |
|---|---|---|---|---|---|---|
| 1 | 141 | 82 | 50 | 106 | 80 | 70 |
| 2 | 49 | 39 | 52 | 1297 | 996 | 1231 |
| 3 | 25 | 9 | 13 | 1 | 0 | 0 |
| 4 | 25 | 57 | 54 | 229 | 214 | 228 |
| 40 | 46 | 58 | 68 | 1375 | 1659 | 1418 |
| 41 | 6440 | 6166 | 5512 | 3000 | 2925 | 3231 |
| 42 | 12461 | 18574 | 14194 | 15437 | 16867 | 18493 |
| 43 | 4 | 0 | 1 | 25 | 30 | 39 |
This is partial raw data for the weekly confirmed cases (upper block) and deaths (lower block) of COVID-19 from the 15th week to the 51st week of 2020 for the 43 sampled countries. The sampling is based on the population size, the availability of COVID-19 data and the healthcare system of a country
Table 7.
Weekly growth rates of cases and deaths of COVID-19
| Country | Week | ||||||
|---|---|---|---|---|---|---|---|
| 15 | 16 | 17 | 48 | 49 | 50 | ||
| 1 | 0.2 | 0.05 | 0.45 | ||||
| 2 | 0.12 | 0.15 | 0.2 | 0.4 | |||
| 3 | 0.00 | 1.51 | |||||
| 4 | 2.44 | 0.61 | 0.36 | 0.00 | |||
| 40 | 0.82 | 0.19 | 0.04 | ||||
| 41 | 0.03 | 0.01 | 0.19 | 0.51 | |||
| 42 | 0.02 | 0.21 | 0.09 | 0.06 | |||
| 43 | 1.24 | 0.09 | 0.14 | 0.06 | |||
| Country | Week | ||||||
|---|---|---|---|---|---|---|---|
| 15 | 16 | 17 | 48 | 49 | 50 | ||
| 1 | 0.00 | 0.00 | 0.00 | ||||
| 2 | 0.02 | 0.01 | 0.00 | 0.01 | |||
| 3 | 0.01 | 0.01 | 0.00 | ||||
| 4 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 | ||
| 40 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| 41 | 0.00 | 0.00 | 0.00 | ||||
| 42 | 0.03 | 0.00 | 0.00 | 0.00 | |||
| 43 | 0.01 | 0.00 | 0.00 | 0.00 | |||
This table partially presents the results of the growth rates of weekly cases and deaths from Week 15 to Week 50 for 43 sampled countries. Each i row vector is whose elements are defined in Equation 3
Table 8.
Fourier coefficients, or inner products, for 36 frequencies (or features) with respect to weekly growth rates of cases (upper block) and deaths (lower block)
| Country | Feature | ||||||
|---|---|---|---|---|---|---|---|
| I | |||||||
| 1 | 0.2 | 0.2 | 0.2 | 0.03 | 0.35 | ||
| 2 | 0.82 | 0.69 | 0.13 | 0.64 | |||
| 3 | 0.51 | 0.09 | 0.86 | ||||
| 4 | 0.8 | 0.88 | 0.76 | 0.59 | 0.78 | 0.8 | |
| 40 | 0.66 | 0.35 | 0.16 | 0.45 | |||
| 41 | 0.12 | 0.09 | 0.4 | ||||
| 42 | 0.01 | 0.04 | 0.14 | 0.34 | |||
| 43 | 1.62 | 0.9 | 0.21 | 1.35 | 1.84 | ||
| Country | Feature | ||||||
|---|---|---|---|---|---|---|---|
| I | |||||||
| 1 | I | ||||||
| 2 | 0.02 | 0 | 0 | 0 | 0.01 | ||
| 3 | 0.04 | 0.01 | 0.02 | 0.04 | |||
| 4 | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.02 | |
| 40 | 0.01 | 0 | 0.01 | 0 | 0 | ||
| 41 | 0.03 | ||||||
| 42 | 0 | 0 | 0 | ||||
| 43 | |||||||
Table 9.
Ranking the Fourier coefficients for cases (upper block) and deaths (lower block) calculated in Table 8
| Country | Feature | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I | |||||||||||||
| 1 | 27 | 25 | 26 | 4 | 30 | 23 | 36 | 15 | 2 | 16 | 8 | 33 | |
| 2 | 36 | 35 | 22 | 20 | 6 | 26 | 2 | 5 | 14 | 1 | 16 | 34 | |
| 3 | 19 | 30 | 2 | 5 | 15 | 34 | 32 | 16 | 1 | 8 | 22 | 35 | |
| 4 | 34 | 36 | 32 | 30 | 29 | 17 | 6 | 2 | 3 | 31 | 33 | 35 | |
| 40 | 36 | 2 | 33 | 7 | 26 | 10 | 23 | 17 | 18 | 21 | 1 | 35 | |
| 41 | 10 | 2 | 24 | 28 | 4 | 34 | 12 | 16 | 36 | 21 | 1 | 35 | |
| 42 | 20 | 23 | 26 | 2 | 7 | 14 | 36 | 17 | 1 | 30 | 3 | 35 | |
| 43 | 35 | 30 | 9 | 6 | 4 | 5 | 2 | 8 | 11 | 20 | 33 | 36 | |
| Country | Feature | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I | |||||||||||||
| 1 | 15 | 1 | 5 | 7 | 10 | 18 | 12 | 4 | 2 | 19 | 16 | 14 | |
| 2 | 36 | 27 | 16 | 33 | 7 | 28 | 18 | 13 | 23 | 8 | 19 | 31 | |
| 3 | 31 | 16 | 1 | 26 | 14 | 10 | 22 | 36 | 4 | 24 | 30 | 2 | |
| 4 | 13 | 36 | 28 | 4 | 7 | 20 | 1 | 21 | 10 | 8 | 30 | 35 | |
| 40 | 34 | 23 | 36 | 12 | 24 | 30 | 14 | 26 | 33 | 16 | 1 | 28 | |
| 41 | 1 | 2 | 4 | 7 | 11 | 6 | 31 | 34 | 36 | 35 | 10 | 3 | |
| 42 | 1 | 6 | 4 | 5 | 7 | 8 | 34 | 36 | 16 | 13 | 2 | 3 | |
| 43 | 34 | 28 | 8 | 7 | 13 | 22 | 17 | 26 | 18 | 19 | 33 | 36 | |
The higher the coefficients are, the higher the ranks are. The higher ranks indicate the main features of weekly growth rates of COVID-19 in terms of the chosen 36 frequencies
Table 2.
Manhattan distance matrices with respect to COVID-19 cases (top block) and COVID-19 deaths (bottom block)
| Country | 1 | 2 | 3 | 4 | 40 | 41 | 42 | 43 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 386 | 332 | 374 | 402 | 376 | 440 | 450 |
| 2 | 386 | 0 | 416 | 312 | 400 | 398 | 428 | 436 |
| 3 | 332 | 416 | 0 | 484 | 480 | 414 | 374 | 374 |
| 4 | 374 | 312 | 484 | 0 | 256 | 376 | 514 | 454 |
| 40 | 402 | 400 | 480 | 256 | 0 | 420 | 480 | 428 |
| 41 | 376 | 398 | 414 | 376 | 420 | 0 | 404 | 436 |
| 42 | 440 | 428 | 374 | 514 | 480 | 404 | 0 | 384 |
| 43 | 450 | 436 | 374 | 454 | 428 | 436 | 384 | 0 |

| Country | 1 | 2 | 3 | 4 | 40 | 41 | 42 | 43 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 470 | 448 | 356 | 464 | 364 | 308 | 454 |
| 2 | 470 | 0 | 370 | 394 | 416 | 440 | 448 | 496 |
| 3 | 448 | 370 | 0 | 486 | 436 | 398 | 498 | 438 |
| 4 | 356 | 394 | 486 | 0 | 348 | 444 | 478 | 496 |
| 40 | 464 | 416 | 436 | 348 | 0 | 504 | 436 | 466 |
| 41 | 364 | 440 | 398 | 444 | 504 | 0 | 392 | 504 |
| 42 | 308 | 448 | 498 | 478 | 436 | 392 | 0 | 430 |
| 43 | 454 | 496 | 438 | 496 | 466 | 504 | 430 | 0 |
Both ranking distance matrices are calculated from Table 9 for 43 countries. The metric measures the feature distance between countries
Fig. 4.
Minimal and maximal pairings for cases and deaths. These figures are based on Table 2. They reveal the distances between ranked Fourier coefficients of the 43 countries. The lower the distance is, the closer the features are
Results
Following the procedures described in Sect. 2, we carry out the data analysis and produce the results in this section with the help of some R 4.1.0 programs [31]. The raw data for weekly total confirmed cases and deaths are further processed and graphed in Figs. 1 and 6. Due to limited space, we choose labelled countries 1 to 4 and 40 to 43 throughout this study for demonstrative purposes. The visualised growth rates for confirmed cases and deaths are graphed in Fig. 2. As expected, the trends for confirmed cases and deaths are very similar, but different in scale. However, this synchrony falls apart as more features and factors are taken into consideration.
Fig. 1.
Trend of weekly total cases of COVID-19. These plots correspond directly to Table 6. They depict the weekly total cases of COVID-19 for the labelled countries 1 to 4 and 40 to 43
Fig. 6.

Trend of weekly total deaths of COVID-19. These plots correspond directly to Table 6
Fig. 2.

Trend of growth rates of weekly total cases and deaths of COVID-19. The solid line concerns the confirmed cases, while the dashed line concerns the deaths in each plot. These plots correspond directly to Table 7
To further reveal the characteristics of the trends, we delve into the growth rates of the total numbers. The visualised results are graphed in Fig. 2.
This visualisation is not sufficient to reveal the underlying features of the changes. To investigate the hidden features of the trend, we use spectral analysis to decompose the national trends into frequencies of different magnitudes. The growth rates, rather than the total cases, are adopted for the decomposition to allow further comparison and analysis between countries: the growth rate in essence already scales the data, which makes the decompositions comparable. The visualised results are graphed in Fig. 3.
Fig. 3.
Inner product of growth rates of weekly total cases and deaths of COVID-19. The solid line concerns the confirmed cases, while the dashed line concerns the deaths in each plot. These plots correspond directly to Table 8
Since the data collected might still contain some noise, we process the resulting Fourier coefficients by ranking. An alternative is to compare the coefficients directly by some similarity measure, but this approach might be affected by the noise and distort the results. The ranks are assigned directly by comparing the values of the coefficients: higher values are assigned higher ranks. After this ranking, we measure the distances between the ranking vectors of the countries via the Manhattan metric, which is a straightforward metric reflecting the difference between two vectors. The calculated results are shown in Table 2.
Clustering analysis Based on Table 2, we analyse the distance matrix by categorising the countries with similar features by two methods for contrasts: minimal pairing and maximal pairing. The minimal one reveals the closer neighbours who share the similar features, whereas the maximal one discloses the further neighbours who bear the most dissimilarities. The results are collectively presented in Fig. 4.
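As a minimal sketch of minimal and maximal pairing, the Python code below takes the cases block of Table 2 and, for each country, looks up its nearest and furthest neighbour. It is restricted to the eight displayed countries, so the pairs it finds may differ from those in Fig. 4, which is based on all 43 countries. Ties are resolved in favour of the first country in label order, which is an assumption; the function name `pairings` is illustrative.

```python
LABELS = [1, 2, 3, 4, 40, 41, 42, 43]

# Manhattan distances between ranking vectors for cases (upper block of Table 2).
CASES = [
    [0, 386, 332, 374, 402, 376, 440, 450],
    [386, 0, 416, 312, 400, 398, 428, 436],
    [332, 416, 0, 484, 480, 414, 374, 374],
    [374, 312, 484, 0, 256, 376, 514, 454],
    [402, 400, 480, 256, 0, 420, 480, 428],
    [376, 398, 414, 376, 420, 0, 404, 436],
    [440, 428, 374, 514, 480, 404, 0, 384],
    [450, 436, 374, 454, 428, 436, 384, 0],
]


def pairings(labels, dist):
    """Map each label to its nearest and furthest neighbour (self excluded)."""
    nearest, furthest = {}, {}
    for i, label in enumerate(labels):
        others = [j for j in range(len(labels)) if j != i]
        nearest[label] = labels[min(others, key=lambda j: dist[i][j])]
        furthest[label] = labels[max(others, key=lambda j: dist[i][j])]
    return nearest, furthest


nearest, furthest = pairings(LABELS, CASES)
print(nearest[4], furthest[4])  # 40 42
```

Within this subset, countries 4 and 40 are each other's nearest neighbour (distance 256), while countries 4 and 42 are each other's furthest neighbour (distance 514).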
Cosine similarity For further comparison, we contrast the spectral analysis with typical cosine similarity analysis. The targets for cosine similarity are the growth rates of confirmed cases and deaths. In other words, it calculates the similarities between the graphs in Fig. 2. Cosine similarity, which is defined in Equation 2, is explicitly or implicitly used in many fields, since it is a composite index which yields a more intuitive interpretation via geometrical notions. Since it is a static indicator, the analysis per se does not really reveal the internal differences between the structures or trends. The results are presented in Table 3.
Table 3.
Typical cosine similarities of COVID-19 cases (top block) and deaths (bottom block)
| Country | 1 | 2 | 3 | 4 | 40 | 41 | 42 | 43 |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.09 | 0.39 | 0.19 | 0.19 |  | 0.64 | 0.04 |
| 2 | 0.09 | 1 | 0.12 | 0.27 | 0.09 | 0.03 | 0.04 | 0.7 |
| 3 | 0.39 | 0.12 | 1 |  |  |  | 0.41 | 0.02 |
| 4 | 0.19 | 0.27 |  | 1 | 0.59 |  |  | 0.46 |
| 40 | 0.19 | 0.09 |  | 0.59 | 1 | 0.17 | 0.11 | 0.14 |
| 41 |  | 0.03 |  |  | 0.17 | 1 | 0.03 |  |
| 42 | 0.64 | 0.04 | 0.41 |  | 0.11 | 0.03 | 1 | 0.03 |
| 43 | 0.04 | 0.7 | 0.02 | 0.46 | 0.14 |  | 0.03 | 1 |

| Country | 1 | 2 | 3 | 4 | 40 | 41 | 42 | 43 |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 0.1 | 0.09 |  |  | 0.09 |  | 0.79 |
| 2 | 0.1 | 1 | 0.3 |  |  |  |  | 0.29 |
| 3 | 0.09 | 0.3 | 1 |  |  | 0.15 | 0.03 | 0.09 |
| 4 |  |  |  | 1 | 0.31 |  | 0.52 |  |
| 40 |  |  |  | 0.31 | 1 |  | 0.12 |  |
| 41 | 0.09 |  | 0.15 |  |  | 1 | 0.43 | 0.03 |
| 42 |  |  | 0.03 | 0.52 | 0.12 | 0.43 | 1 |  |
| 43 | 0.79 | 0.29 | 0.09 |  |  | 0.03 |  | 1 |
Summary
From Figs. 1 and 6, we observe that the trends of confirmed cases and deaths are largely the same;
From Fig. 2, the trends of growth rates for confirmed cases and deaths are not similar to each other. Furthermore, the growth rates for deaths are much more stable than the ones for the confirmed cases;
From Fig. 3, one could observe that the Fourier coefficients for confirmed cases and deaths are in line with each other; there are no obvious leading frequencies that decide the trend of growth rates;
From Fig. 4, we observe that the graphs are disconnected from the spectral features’ point of view. The sub-graphs indicate the closeness between their features. Each group should share some hidden properties which are not necessarily linked to the geographical locations;
From Fig. 4, we find that the maximal clusters do not form a connected graph. This indicates that the grouping is distinct and clear-cut, i.e., there are some separable features among the countries;
Comparing Fig. 5 with Fig. 4, we find that cosine similarities lead to fewer representing countries than the spectral-featured counterparts; in addition, the confirmed cases have even fewer representing countries than the deaths. This suggests that the deaths might depend on the circumstances of each individual country, for example, the healthcare system;
From Table 4, the closest pair is the minimal pairing for confirmed cases via spectral analysis (fq_mc) and the maximal pairing for confirmed cases via cosine similarity (cs_Mc), at a distance of 36. This indicates that these two methods indeed produce substantially different results (if the two methods were similar to each other, we would expect a closer relation between fq_mc and cs_mc than the one between fq_mc and cs_Mc).
Fig. 5.
Minimal and maximal pairings for cases and deaths by typical cosine values. These figures are based on Table 3. They reveal the similarities between weekly growth rates of COVID-19 for the 43 countries
Table 4.
Distances between adjacency matrices
|  | fq_mc | fq_Mc | fq_md | fq_Md | cs_mc | cs_Mc | cs_md | cs_Md |
|---|---|---|---|---|---|---|---|---|
| fq_mc | 0 | 84 | 86 | 86 | 86 | 36 | 86 | 80 |
| fq_Mc | 84 | 0 | 82 | 82 | 76 | 82 | 84 | 84 |
| fq_md | 86 | 82 | 0 | 80 | 84 | 86 | 84 | 82 |
| fq_Md | 86 | 82 | 80 | 0 | 84 | 86 | 70 | 86 |
| cs_mc | 86 | 76 | 84 | 84 | 0 | 86 | 84 | 86 |
| cs_Mc | 36 | 82 | 86 | 86 | 86 | 0 | 86 | 80 |
| cs_md | 86 | 84 | 84 | 70 | 84 | 86 | 0 | 86 |
| cs_Md | 80 | 84 | 82 | 86 | 86 | 80 | 86 | 0 |
The distances for the eight minimal/maximal pairings for confirmed cases and deaths via spectral analysis and cosine similarities are computed. fq_mc, fq_Mc, fq_md and fq_Md are the minimal pairings of the confirmed cases, the maximal pairings of the confirmed cases, the minimal pairing of the deaths, and the maximal pairings of the deaths by frequency, respectively. Similarly, cs_mc, cs_Mc, cs_md and cs_Md are the minimal pairings of the confirmed cases, the maximal pairings of the confirmed cases, the minimal pairing of the deaths, and the maximal pairings of the deaths by cosine similarity, respectively
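The source does not spell out how the distance between two adjacency matrices is computed. As a minimal sketch, the Python code below assumes that each pairing is stored as a 0/1 adjacency matrix with one directed edge per country (from the country to its paired neighbour) and that the distance is the entrywise Manhattan distance. Under this assumption the distance between two pairings of 43 countries is an even number between 0 and 86, which is consistent with the values in Table 4, although this does not confirm that the authors used this definition. The function names and the example pairings are hypothetical.

```python
def pairing_adjacency(targets):
    """Build a 0/1 adjacency matrix where targets[i] is the neighbour paired with i."""
    n = len(targets)
    adj = [[0] * n for _ in range(n)]
    for i, j in enumerate(targets):
        adj[i][j] = 1
    return adj


def adjacency_distance(a, b):
    """Entrywise Manhattan distance between two equally sized matrices."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# Two hypothetical pairings of four countries (indices 0..3).
p = pairing_adjacency([1, 0, 3, 2])
q = pairing_adjacency([1, 0, 0, 2])
print(adjacency_distance(p, q))  # 2
```

Each country whose paired neighbour differs between the two pairings contributes exactly 2 to the distance, so a distance of 2k means the two pairings disagree on k countries.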
Prediction Now we show how to further approximate the weekly COVID-19 confirmed cases and deaths based on obtained features. In this section, for any vector , we use to denote its j-th element in . Let and denote the total confirmed cases and total deaths by Week t for Country i, respectively. Define the growth rates and . For each , let vector
and let vector
Now we want to estimate the values, which extend the values in Table 6, of and based on the derived features. To preserve the features as much as possible, the decision is made by choosing the such that , or for computational purpose , is minimized. Then by the definition, is computed. Hence the general problem could be formulated by finding the optimal such that
is minimised. The implementation is described in the following claim:
Claim 1
and for any arbitrary .
where ; where ; and where
Proof
Define . Then
Furthermore, by the first order derivative
and second order derivative the result follows immediately.
Corollary 1
and for any arbitrary
where ; where ;
Proof
The result could be deduced by the same inference above.
Based on these alternating inductive steps, we predict the confirmed cases and deaths of the next four weeks for the 43 sampled countries. The results are partially presented in Table 5 and fully shown in Figs. 7 and 8. From these results, we find that if the features are preserved, then the development of confirmed cases and deaths shall not be monotonic among these countries. This offers an explanation of why the pandemic is not synchronous among the countries.
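Once a growth rate for the next week has been chosen, the corresponding total follows from the definition of the growth rate. The Python sketch below shows only this last step and assumes the growth rate has the form (next value - current value) / (current value + 1). How the growth rate itself is selected is the subject of Claim 1 and is not reproduced here; the function name `project_totals` and the numbers in the example are hypothetical.

```python
def project_totals(last_total, growth_rates):
    """Roll totals forward, one week per supplied growth rate.

    Inverts g = (x_next - x_cur) / (x_cur + 1), i.e. x_next = x_cur + g * (x_cur + 1).
    """
    totals = []
    current = last_total
    for g in growth_rates:
        current = current + g * (current + 1)
        totals.append(current)
    return totals


# Hypothetical example: a last observed total of 1000 and two assumed growth rates.
projected = project_totals(1000, [0.1, 0.1])
print([round(x, 2) for x in projected])  # [1100.1, 1210.21]
```

With a constant positive growth rate the projected totals keep increasing, and with a constant negative one they keep decreasing, so the direction of the projection is fixed by the sign of the chosen growth rate.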
Table 5.
Prediction of confirmed cases and deaths
| Country | Week 52 | Week 53 | Week 54 | Week 55 |
|---|---|---|---|---|
| 1 | 2498 | 2012 | 1621 | 1305 |
| 2 | 68343 | 95408 | 133193 | 185942 |
| 3 | 419 | 1050 | 2633 | 6602 |
| 4 | 7979 | 6254 | 4902 | 3842 |
| 40 | 56615 | 45577 | 36691 | 29537 |
| 41 | 288387 | 436014 | 659211 | 996665 |
| 42 | 1681616 | 1780656 | 1885529 | 1996578 |
| 43 | 3071 | 3254 | 3447 | 3653 |

| Country | Week 52 | Week 53 | Week 54 | Week 55 |
|---|---|---|---|---|
| 1 | 70 | 70 | 69 | 69 |
| 2 | 1239 | 1248 | 1256 | 1264 |
| 3 | 0 | 0 | 0 | 0 |
| 4 | 228 | 228 | 229 | 229 |
| 40 | 1414 | 1410 | 1406 | 1402 |
| 41 | 3239 | 3247 | 3255 | 3262 |
| 42 | 18513 | 18533 | 18553 | 18573 |
| 43 | 39 | 39 | 39 | 40 |
This table shows the prediction of weekly accumulated confirmed cases (upper block) and deaths (lower block) from Week 52 to Week 55 for the sampled 43 countries. The prediction is based on the spectral analysis
Fig. 7.

Prediction for confirmed cases from Week 52 to Week 54. Based on the theory presented in Sect. 3, the results of weekly accumulated confirmed cases of COVID-19 from Week 52 to Week 54 for all the 43 sampled countries are computed
Fig. 8.

Prediction for deaths from Week 52 to Week 54. Based on the theory presented in Sect. 3, the results of weekly accumulated deaths of COVID-19 from Week 52 to Week 54 for all the 43 sampled countries are computed
Discussion and conclusion
The main purpose of this study is to extract the patterns of evolution of COVID-19 regarding confirmed cases and deaths across the globe and to predict the future trend via spectral analysis. The results are presented in Sect. 3. The characteristics of the approach in this study are as follows:
Spectral analysis is applied to detecting and tracking the hidden features of growth rates of confirmed cases and deaths of COVID-19. This method offers some advantages over statistical approaches: there is no predetermined independent variable association and no need to consider or interpret the interactions between chosen factors. This characteristic provides a much more efficient way for automatic reasoning, though the features extracted are somewhat mechanical.
By the Manhattan metric and ranking techniques, we could then perform spectral clustering, which groups the countries with similar features. Since this is a feature-based clustering, the groups could be easily identified by their representing frequencies. Unlike classical cosine similarity analysis, which is a composite and descriptive index that makes it hard to identify the representing properties of the clusters, spectral analysis delves into the internal structures or dynamical properties of the trend.
Spectral analysis could also be applied to approximation problems, as is done in the prediction of confirmed cases and deaths of COVID-19 in this study.
Based on these characteristics, there are a couple of points we would like to address:
Relatively speaking, the main advantage of the statistical approach, in particular regression-related methods, over spectral analysis lies in the interpretability of its causal variables. But this might also be a disadvantage, since one needs to specify the independent variables, which is not the case for spectral analysis. Hence, for automatic reasoning, spectral analysis should turn out to be more effective, but for meaningful interpretation, the statistical approach is preferable. The choice between them depends on one’s purpose.
When one is interested in finding out the fundamental features, or the benchmarks, of time series, then spectral analysis is the candidate, i.e., the internal structures of data are revealed via the magnitudes of the frequencies; cosine similarity, on the other hand, is a composite and static indicator of the relation between vectors: it is more intuitive in geometrical interpretation, but ambiguous in revealing or comparing the internal structures.
As for the study, there are some points worth noticing and enhancing.
Some of the results about causal relations in this study might not agree with other studies [25]. This is reasonable, since the approach we adopt focuses more on feature detection, not solely on finding causal relations.
One could also delve into the phase shifts of the frequencies by lifting the constraint on weekly growth rates. This might yield an even more dynamical picture of the evolution.
The samples filtered are based on some criteria. One could loosen or strengthen the criteria to compare the results generated.
During the reviewing process of this manuscript, a new variant, Omicron [33, 34], emerged, whose dynamical behaviour is worth further investigating [35].
Acknowledgements
This work is supported by the Humanities and Social Science Research Planning Fund Project under the Ministry of Education of China (No. 20XJAGAT001).
Appendix
Refined raw data
See Table 6.
Deaths from Week 15 to Week 51
See Fig. 6.
Weekly growth rates for cases and deaths
See Table 7.
Fourier coefficients
See Table 8.
Ranking the Fourier coefficients
See Table 9.
Prediction from Week 52 to Week 54
See Figs. 7 and 8.
Data Availability Statement
All data used in this article are included in the manuscript.
References
- 1.Worldometer, COVID-19 CORONAVIRUS PANDEMIC (2021), https://www.worldometers.info/coronavirus/. Accessed 28 June 2021
- 2.Mahase E. Covid-19: What have we learnt about the new variant in the UK? BMJ. 2020;2020:371. doi: 10.1136/bmj.m4944. [DOI] [PubMed] [Google Scholar]
- 3.A.M. Pollock, Asymptomatic transmission of covid-19. BMJ 2020, 371 (2020). 10.1136/bmj.m4851
- 4.Priyadarshini I, Mohanty P, Kumar R, Son LH, Chau HTM, Nhu V-H, Thi Ngo PT, Tien Bui D. Analysis of outbreak and global impacts of the COVID-19. Healthcare. 2020;8:148. doi: 10.3390/healthcare8020148.
- 5.Asch DA, Sheils NE, Islam MN, et al. Variation in US hospital mortality rates for patients admitted with COVID-19 during the first 6 months of the pandemic. JAMA Intern. Med. 2020;181:471–478. doi: 10.1001/jamainternmed.2020.8193.
- 6.Boudourakis L, Uppal A. Decreased COVID-19 mortality—A cause for optimism. JAMA Intern. Med. 2020;181:478–479. doi: 10.1001/jamainternmed.2020.8438.
- 7.R.-M. Chen, Randomness for nucleotide sequences of SARS-CoV-2 and its related subfamilies. Comput. Math. Methods Med. 2020, Article ID 8819942, 8 pages (2020) 10.1155/2020/8819942
- 8.Chen R-M. Quantifying collective intelligence and behaviours of SARS-CoV-2 via environmental resources from virus’ perspectives. Environ. Res. 2021 doi: 10.1016/j.envres.2021.111278.
- 9.Chen R-M. Track the dynamical features for mutant variants of COVID-19 in the UK. Math. Biosci. Eng. 2021;18(4):4572–4585. doi: 10.3934/mbe.2021232.
- 10.Chen R-M. On COVID-19 country containment metrics: a new approach. J. Decis. Syst. 2021 doi: 10.1080/12460125.2021.1886625.
- 11.Baden LR, El Sahly HM, Essink B, et al. Efficacy and safety of the mRNA-1273 SARS-CoV-2 Vaccine. N. Engl. J. Med. 2021;384:403–416. doi: 10.1056/NEJMoa2035389.
- 12.Polack FP, Thomas SJ, Kitchin N, et al. Safety and efficacy of the BNT162b2 mRNA COVID-19 Vaccine. N. Engl. J. Med. 2020;383:2603–2615. doi: 10.1056/NEJMoa2034577.
- 13.Gowrisankar A, Rondoni L, Banerjee S. Can India develop herd immunity against COVID-19? Eur. Phys. J. Plus. 2020;135:526. doi: 10.1140/epjp/s13360-020-00531-4.
- 14.Easwaramoorthy D, Gowrisankar A, Manimaran A, et al. An exploration of fractal-based prognostic model and comparative analysis for second wave of COVID-19 diffusion. Nonlinear Dyn. 2021;106:1375–1395. doi: 10.1007/s11071-021-06865-7.
- 15.Kavitha C, Gowrisankar A, Banerjee S. The second and third waves in India: when will the pandemic be culminated? Eur. Phys. J. Plus. 2021;136:596. doi: 10.1140/epjp/s13360-021-01586-7.
- 16.Chu J. A statistical analysis of the novel coronavirus (COVID-19) in Italy and Spain. PLoS ONE. 2021;16(3):e0249037. doi: 10.1371/journal.pone.0249037.
- 17.Ison D. Statistical procedures for evaluating trends in coronavirus disease-19 cases in the United States. Int. J. Health Sci. 2020;14(5):23–31.
- 18.Muthusami R, Saritha K. Statistical analysis and visualization of the potential cases of pandemic coronavirus. Virusdisease. 2020;31(2):204–208. doi: 10.1007/s13337-020-00610-1.
- 19.Baker TB, Smith SS, Bolt DM, Loh WY, Mermelstein R, Fiore MC, Piper ME, Collins LM. Implementing clinical research using factorial designs: a primer. Behav. Ther. 2017;48(4):567–580. doi: 10.1016/j.beth.2016.12.005.
- 20.S. Huang, J. Yang, S. Fong, Q. Zhao, Artificial intelligence in the diagnosis of COVID-19: challenges and perspectives. Int. J. Biol. Sci. 17(6):1581-1587 (2021). 10.7150/ijbs.58855. Available from https://www.ijbs.com/v17p1581.htm
- 21.R. Vaishya, M. Javaid, I. H. Khan, A. Haleem, Artificial Intelligence (AI) applications for COVID-19 pandemic. Diabetes Metabol. Syndr.: Clin. Res. Rev. 14(4):337–339 (2020). ISSN 1871-4021, 10.1016/j.dsx.2020.04.012
- 22.Yi PH, Kim TK, Lin CT. Generalizability of deep learning tuberculosis classifier to COVID-19 chest radiographs: new tricks for an old algorithm? J. Thorac. Imaging. 2020;35(4):W102–W104. doi: 10.1097/RTI.0000000000000532.
- 23.Yousefzadeh M, et al. ai-corona: Radiologist-assistant deep learning framework for COVID-19 diagnosis in chest CT scans. PLoS ONE. 2021;16(5):e0250952. doi: 10.1371/journal.pone.0250952.
- 24.Takefuji Y. Fourier analysis using the number of COVID-19 daily deaths in the US. Epidemiol. Infect. 2021;149:E64. doi: 10.1017/S0950268821000522.
- 25.Valev D. Relationships of total COVID-19 cases and deaths with ten demographic, economic and social indicators. medRxiv. 2020 doi: 10.1101/2020.09.05.20188953.
- 26.Roman S. Advanced Linear Algebra. 3rd ed. Berlin: Springer; 2007.
- 27.A.L. Schoenstadt, An Introduction to Fourier Analysis. https://www.math.bgu.ac.il/~leonid/ode_9171_files/Schoenstadt_Fourier_PDE.pdf. Accessed 29 Dec 2020
- 28.The Humanitarian Data Exchange, Geographic Distribution of COVID-19 Worldwide, https://data.humdata.org/dataset
- 29.P.E. Black, Manhattan distance, in Dictionary of Algorithms and Data Structures [online], ed. by P.E. Black, 11 February 2019. Available from: https://www.nist.gov/dads/HTML/manhattanDistance.html. Accessed 20 May 2021
- 30.A. Tandon, C.J.L. Murray, J.A. Lauer, D.B. Evans, Measuring overall health system performance for 191 countries, GPE Discussion Paper Series: No. 30, https://www.who.int/healthinfo/paper30.pdf
- 31.A.-B. Vincent, N. Enevoldsen, C.J. Yetman, countrycode: An R package to convert country names and country codes. J. Open Source Softw. 3(28):848 (2018). 10.21105/joss.00848
- 32.Chen RM. Economic categorizing based on DFT-induced supervised learning. Comput. Econ. 2020 doi: 10.1007/s10614-020-10076-4.
- 33.Gowrisankar A, Priyanka TMC, Banerjee S. Omicron: a mysterious variant of concern. Eur. Phys. J. Plus. 2022;137:100. doi: 10.1140/epjp/s13360-021-02321-y.
- 34.Christie B. Covid-19: Early studies give hope omicron is milder than other variants. BMJ. 2021;375:3144. doi: 10.1136/bmj.n3144.
- 35.Khajanchi S, Sarkar K, Banerjee S. Modeling the dynamics of COVID-19 pandemic with implementation of intervention strategies. Eur. Phys. J. Plus. 2022;137:129. doi: 10.1140/epjp/s13360-022-02347-w.