Medicine. 2020 Jun 12;99(24):e20774. doi: 10.1097/MD.0000000000020774

Geographic risk assessment of COVID-19 transmission using recent data

An observational study

Tung-Hui Jen a,b, Tsair-Wei Chien c, Yu-Tsen Yeh d, Jui-Chung John Lin e, Shu-Chun Kuo f,g, Willy Chou h,i
Editor: Oliver Schildgen
PMCID: PMC7302653  PMID: 32541529

Supplemental Digital Content is available in the text

Keywords: case fatality rate, geographic risk, novel coronavirus-19, outbreak magnitudes, Rasch model

Abstract

Background:

The US Centers for Disease Control and Prevention (CDC) regularly issues “travel health notices” that address disease outbreaks of novel coronavirus disease (COVID)-19 in destinations worldwide. The notices are classified into 3 levels based on the risk posed by the outbreak and what precautions should be in place to prevent spreading. What objectively observed criteria of these COVID-19 situations are required for classification and visualization? This study aimed to visualize the epidemic outbreak and the provisional case fatality rate (CFR) using the Rasch model and Bayes's theorem and developed an algorithm that classifies countries/regions into categories that are then shown on Google Maps.

Methods:

We downloaded daily COVID-19 outbreak numbers for countries/regions from the GitHub website, which contains information on confirmed cases in more than 30 Chinese locations and other countries/regions. The Rasch model was used to estimate the epidemic outbreak for each country/region using data from recent days. All responses were transformed by using the logarithm function. The CFRs based on Bayes's theorem were computed for each region. The geographic risk of transmission of the COVID-19 epidemic was thus determined using both magnitudes (i.e., Rasch scores and CFRs) for each country.

Results:

The top 7 countries were Iran, South Korea, Italy, Germany, Spain, China (Hubei), and France, with values of {4.53, 3.47, 3.18, 1.65, 1.34, 1.13, and 1.06} and {13.69%, 0.91%, 47.71%, 0.23%, 24.44%, 3.56%, and 16.22%} for the outbreak magnitudes and CFRs, respectively. The results were consistent with the US CDC travel advisories of warning level 3 in China, Iran, and most European countries and of level 2 in South Korea on March 16, 2020.

Conclusion:

We created an online algorithm that used the CFRs to display the geographic risks to understand COVID-19 transmission. The app was developed to display which countries had higher travel risks and aid with the understanding of the outbreak situation.


Key Points

  • Bayes's theorem was used to verify the risk denoted by the case fatality rate (CFR) in an individual region. The shared portions of deaths and recoveries allow the probability P(A1) (i.e., the CFR) to be assessed more accurately than is possible without knowledge of those shares.

  • Two modes based on confirmed cases and CFRs are suggested and combined with the doubling days for the confirmed cases of COVID-19, a combination that has not been discussed in the literature.

  • An app was developed to display the provisional CFR and the Rasch analysis online, improving on traditional dashboards that lack a particular mathematical algorithm.

1. Introduction

Since the outbreak of the 2019 novel coronavirus disease (COVID-19) in Wuhan city, China, on January 30, 2020,[1,2] a total of 182,185 confirmed cases and 7148 deaths had been reported by March 16, 2020,[3] involving 31 provinces/cities in China as well as 162 countries/regions outside of China.[4] The total number of deaths (=7148) has substantially surpassed those from severe acute respiratory syndrome (final toll of 774 deaths in 2003) and the Middle East respiratory syndrome (final toll of 858 deaths in 2012).[5–7]

1.1. Travel information required for knowledge of COVID-19 risk

In an influenza pandemic, the strength of the increase in confirmed cases is a proxy for epidemic size and disease transmissibility.[8] The US Centers for Disease Control and Prevention (CDC) has established geographic risk-stratification criteria for the purpose of issuing travel health notices for countries with COVID-19 risk and guiding management decisions for people with potential travel-related exposure to COVID-19.[9]

Four strata have been established:

  • (1) limited community transmission,

  • (2) sustained (ongoing) community transmission,

  • (3) widespread, sustained (ongoing) transmission, and

  • (4) widespread, sustained (ongoing) transmission and restrictions on entry to the United States. For instance, on March 16, 2020, the entry of foreign nationals from China and Iran was suspended.

The CDC recommended that

  • (1) travelers avoid all nonessential travel to the following destinations (China, Iran, and most European countries), and

  • (2) older adults or those with chronic medical conditions consider postponing travel to South Korea.

These represent the 3 levels of notice based on the risk presented by the outbreak and the precautions that are needed to prevent infection, including watch level 1, alert level 2, and warning level 3.

Although a number of factors were involved in publishing the geographic risk stratification, including size (e.g., the number of confirmed cases), geographic distribution, and epidemiology of the outbreak,[8] none of these objectively observed criteria were provided to us for our assessment of the COVID-19 situation for each country/region.

1.2. Risk assessment on an app

As of February 29, 2020, more than 377 articles related to COVID-19 were searchable with the keyword “covid-19 or 2019-nCoV” on PubMed Central (PMC).[10] The Johns Hopkins Center for Systems Science and Engineering (JHC) has built an online dashboard and regularly updates the data to track the worldwide spread of the 2019-nCoV outbreak[3] with the hope of providing the public with a better understanding of the COVID-19 outbreak. However, the JHC[3] and other dashboards[4,11,12] only provided visual dashboards of the world map and included little information on the outbreak and bubbles for countries/regions. No solid geographic risk assessment for COVID-19 transmission has been seen yet on the internet, including on those websites[3,4,13–18] providing simple and widely available information (e.g., the number of confirmed, deaths, and recovered cases based on countries/regions along with death rate, transmission rate, incubation period, as well as discussions on age and demographics) to the public. None were found to be equipped with travel information that would fulfill the public's needs.

1.3. The risks assessed by using the Rasch model

Rasch models,[19] which were named after Georg Rasch, are a family of psychometric models for creating measurements from categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between (a) respondent ability and (b) task difficulty.[20]

In addition to psychometrics and educational research, the Rasch model and its extensions have been used in other areas, including the health profession[21] and market research,[22] because of their general applicability.[23]
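The trade-off between ability and difficulty can be illustrated with the simplest member of the family, the dichotomous Rasch model, in which the log odds of a positive response equal the difference between the two parameters. The sketch below is only an illustration: the function names and the parameter values are hypothetical, and the present study relied on an online rating scale implementation rather than on this code.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Dichotomous Rasch model: probability of a positive response.

    theta: ability of the respondent (here, analogous to a region), in logits
    b:     difficulty of the item (here, analogous to a day), in logits
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def logit(p: float) -> float:
    """Log odds of a probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical values, chosen only for illustration.
p_equal = rasch_probability(theta=0.0, b=0.0)    # ability equals difficulty
p_harder = rasch_probability(theta=1.0, b=2.0)   # difficulty exceeds ability

print(f"ability == difficulty: p = {p_equal:.2f}")
print(f"ability <  difficulty: p = {p_harder:.2f}")
print(f"log odds recovered:    {logit(p_harder):.2f} logits")
```

Because the log odds depend only on the difference theta − b, respondents and items (or regions and days) are placed on one common logit scale.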

Our goal was to determine whether Rasch analysis could be used for inspecting epidemic magnitudes by observing the pattern of daily confirmed cases. The reasons for using the Rasch model are that

  • 1. all responses are ordinal within a specific range (e.g., from 0–5 on a Likert-type scaling survey),

  • 2. all regions and days (like persons and items on a test) are placed on an equal-interval continuum with a unit of logit (=log odds) for comparison,[21,24] and

  • 3. sequential assessments estimate the epidemic magnitudes and examine the COVID-19 situation for each country/region, whereas the traditional method of using cumulative confirmed cases ignores the recent cases, which carry greater weight (i.e., importance) in determining the outbreak magnitudes.

1.4. Geographic risk assessment of case fatality rates

The case fatality rate (CFR) is related to the following questions:

  • (1) How deadly is this? and

  • (2) How many people will die in this outbreak?

Severe acute respiratory syndrome, the Middle East respiratory syndrome, Ebola, and H1N1 yielded real CFRs of 9.6%, 34.4%, 73%, and 0.4%, respectively,[5–7] and the CFR for COVID-19 has been discussed in numerous articles.[25–27]

The World Health Organization, in a press conference on January 29, 2020, announced that the death rate of COVID-19 was 2% based on the CFR calculation (= deaths/cases).[4,28–31] This figure was substantially underestimated because it assumed

  • 1. no lag days from symptom onset to death (i.e., death tolls registered and confirmed many days ago)[27] and

  • 2. all currently infected cases had totally (i.e., 100%) recovered.

Bayes's theorem (alternatively Bayes's law or Bayes's rule) describes the probability of an event based on prior knowledge of conditions that might be related to the event.[32] It is necessary to use the post-CFR to adjust the prior-CFR for each country/region on COVID-19 to examine the geographic risks. This is because the post-CFR might be increased if the conditional probability of death is greater than the counterpart of recoveries according to the equation P(A1|B) = P(B|A1)P(A1) / [P(B|A1)P(A1) + P(B|A2)P(A2)], where the probability P(A1) (i.e., the CFR) is updated on the basis of (1) the shared portions of conditional deaths and recoveries, P(B|A1) and P(B|A2), and (2) the total probability (e.g., P(A1) + P(A2) = 1 for a particular region, so that P(A2) = 1 − CFR). The shared portions can be used to assess the probability P(A1) (i.e., the CFR) more accurately than can be done without knowledge of the shares.

In the current study, we were motivated to apply Bayes's theorem to estimate the adjusted CFR for countries/regions on COVID-19.

1.5. The aims of this study

The aims of the current study were to

  • 1. visualize (i) the outbreak magnitude and (ii) the adjusted CFRs for countries/regions in recent days,

  • 2. develop an algorithm that classifies countries/regions into categories of outbreak epidemics and shows them on Google Maps, and

  • 3. design an app for better interpreting the geographic risk of COVID-19 transmission.

2. Methods

2.1. Data source

We downloaded COVID-19 outbreak numbers on March 16, 2020, from GitHub,[13] a site that provides information on newly confirmed cases in more than 31 Chinese locations and other countries/regions. All downloaded data (in Supplemental Digital Content file 1) were publicly displayed on the website. Ethical approval was not necessary for this study because all the data were obtained via the internet.[13]

2.2. Rasch model for obtaining the outbreak magnitudes

The Rasch analysis[33] was performed online using author-developed codes.[34] All responses were transformed into ordinal scores from 0 to 5 using the logarithm function (i.e., the Excel formula ROUND(LN(confirmed cases), 0)) for each region in China and other countries. The geographic risks for COVID-19 transmission were determined by both the outbreak magnitudes with a unit of logit (log odds) and the adjusted CFRs based on Bayes's theorem.
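The logarithmic transformation of daily counts into ordinal scores can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, counts below 1 are mapped to 0 (the logarithm is undefined for 0), and scores above 5 are clipped to 5; the handling of these edge cases is not described in the text.

```python
import math


def ordinal_score(confirmed_cases: int, low: int = 0, high: int = 5) -> int:
    """Map a case count to an ordinal score via a rounded natural logarithm.

    Mirrors the spreadsheet-style ROUND(LN(x), 0). Mapping counts below 1
    to `low` and clipping at `high` are assumptions made for this sketch.
    """
    if confirmed_cases < 1:
        return low
    # floor(x + 0.5) rounds halves upward, as a spreadsheet ROUND does for
    # non-negative values (Python's built-in round() rounds halves to even).
    score = math.floor(math.log(confirmed_cases) + 0.5)
    return max(low, min(high, score))


# Hypothetical daily counts for one region.
daily_counts = [0, 1, 3, 12, 20, 100, 100000]
print([ordinal_score(n) for n in daily_counts])
```

Because of the logarithm, each additional score point corresponds to a roughly 2.7-fold increase in cases, so large counts are compressed much more strongly than small ones.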

2.3. Bayes's theorem for producing the adjusted CFRs

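We defined the adjusted post-CFR as shown in Eqs. (1) and (2):

\[ P(A_1 \mid B) = \frac{P(B \mid A_1)\,P(A_1)}{P(B)} \tag{1} \]

\[ P(B) = P(B \mid A_1)\,P(A_1) + P(B \mid A_2)\,P(A_2) \tag{2} \]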

where P(A1|B) denotes the post-CFR, P(B) stands for the burnouts (or the load of dealing with the currently infected cases in the respective region) on COVID-19, and P(B|Ai) represents the conditional probabilities observed from the structure (or pattern) in deaths (=A1) and recoveries (=A2). P(A1) and P(A2) are the prior-CFR (=deaths/confirmed cases) and the probability of recovery (=1 − CFR), respectively. In Eqs. (3) and (4), the adjusted post-CFR is higher than the prior-CFR if P(B|A1) is greater than P(B|A2); otherwise, the post-CFR is less than the prior-CFR. As such, the transmission risk can be denoted by the adjusted post-CFR because these two metrics in Eqs. (3) and (4) are unequal.

Imagine that at the end of the outbreak course, both P(B|A1) and P(B|A2) converge to have identical values and lead both post/prior-CFRs to be equal.
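The adjustment from the prior-CFR to the post-CFR can be sketched in a few lines. This is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, and the conditional probabilities P(B|A1) and P(B|A2) are passed in as given numbers because the way they are derived from the death and recovery patterns is not reproduced here. The likelihood values in the example are invented for illustration; the death and case totals are the worldwide figures quoted in the Introduction.

```python
def prior_cfr(deaths: int, confirmed: int) -> float:
    """Prior CFR, P(A1) = deaths / confirmed cases."""
    return deaths / confirmed


def post_cfr(prior: float, p_b_given_death: float, p_b_given_recovery: float) -> float:
    """Post-CFR, P(A1|B), from Bayes's theorem with two outcomes.

    prior:               P(A1), the prior CFR
    p_b_given_death:     P(B|A1), supplied by the caller (assumed input)
    p_b_given_recovery:  P(B|A2), supplied by the caller (assumed input)
    """
    p_a1 = prior
    p_a2 = 1.0 - prior                      # probability of recovery
    p_b = p_b_given_death * p_a1 + p_b_given_recovery * p_a2
    if p_b == 0:
        raise ValueError("P(B) is zero; the post-CFR is undefined")
    return p_b_given_death * p_a1 / p_b


# Worldwide totals reported by March 16, 2020 (see Introduction).
prior = prior_cfr(deaths=7148, confirmed=182185)

# Hypothetical conditional probabilities, for illustration only.
print(f"prior CFR:                 {prior:.4f}")
print(f"post CFR, P(B|A1)>P(B|A2): {post_cfr(prior, 0.6, 0.3):.4f}")
print(f"post CFR, P(B|A1)=P(B|A2): {post_cfr(prior, 0.5, 0.5):.4f}")
```

When the two conditional probabilities are equal, the post-CFR reduces to the prior-CFR, which matches the end-of-outbreak behavior described above.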

2.4. World maps and the Kano diagram for displaying geographical messages

World maps have been used to show disparities in health outcomes across areas in many disciplines,[35,36] such as dengue outbreaks,[37,38] disease hotspots,[39] and the Global Health Observatory (GHO) maps on major health topics.[40]

A Kano diagram[41,42] was used to highlight the geographic risks of countries/regions. The Kano diagram was used to divide areas into three groups; bubbles were colored by latitude (i.e., above 40° in green and below 23.5° in red) and sized by the doubling days for the confirmed cases of COVID-19 (i.e., the number of days it takes to double the number of confirmed cases, starting from at least 10 cases). The formula 1/d ∗ 10, where d denotes the doubling days, was applied to transform the doubling days into a scale on which a higher value means that fewer days are needed to double the number of confirmed cases.
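One way to compute the doubling days and the bubble-size scale is sketched below. It rests on two assumptions that the text does not settle: the doubling days are counted from the first day on which the cumulative count reaches at least 10 until the count is at least twice that day's value, and the formula 1/d ∗ 10 is read as (1/d) × 10. The function names and the sample series are hypothetical.

```python
from typing import Optional, Sequence


def doubling_days(cumulative: Sequence[int], min_cases: int = 10) -> Optional[int]:
    """Days needed for the cumulative count to double.

    Counting starts on the first day with at least `min_cases` cases
    (an assumed reading of the definition). Returns None if the series
    never reaches `min_cases` or never doubles afterwards.
    """
    start = next((i for i, n in enumerate(cumulative) if n >= min_cases), None)
    if start is None:
        return None
    target = 2 * cumulative[start]
    for day in range(start + 1, len(cumulative)):
        if cumulative[day] >= target:
            return day - start
    return None


def doubling_scale(d: int) -> float:
    """Bubble-size scale, reading 1/d * 10 as (1/d) * 10 (assumption)."""
    return 10.0 / d


# Hypothetical cumulative confirmed cases, one value per day.
series = [2, 5, 10, 13, 18, 25, 40]
d = doubling_days(series)
if d is not None:
    print(f"doubling days: {d}, scale: {doubling_scale(d):.2f}")
```

Under this reading, a region that doubles in 5 days gets a larger bubble than one that doubles in 7 days.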

Rasch logit scores are on the x-axis and adjusted CFRs on the y-axis. The numbers of confirmed cases in the most recent 20 and 10 days were each transformed into ordinal scores from 0 to 4 for comparison.

On the other hand, we plotted countries/regions on the Kano diagram, dividing them among four features represented by different colors:

  • 1. ready to increase (yellow),

  • 2. increasing (green),

  • 3. starting to decrease (light green), and

  • 4. decreasing (red).

A specific algorithm was applied to the categorization of the features mentioned above. Three types of line charts were provided to verify that the 4 features were fully supported.

2.5. A dashboard on Google Maps to present the trend

A dashboard app was designed to give travelers a daily updated geographic display of the epidemic situation. We examined whether the Rasch model could be applied to evaluate the risk-alert level for COVID-19 by examining the advisories of the US CDC. The study flowchart is shown in Figure 1 and Supplemental Digital Content file 2.

Figure 1. Study flowchart.

3. Results

3.1. Geographic risks

On March 16, 2020, we observed that the top 7 countries/regions were Iran, South Korea, Italy, Germany, Spain, China (Hubei), and France, with values of {4.53, 3.47, 3.18, 1.65, 1.34, 1.13, and 1.06} and {13.69%, 0.91%, 47.71%, 0.23%, 24.44%, 3.56%, and 16.22%} for outbreak magnitudes and CFRs, respectively, using the last 20 days for measurement (see Fig. 2).

Figure 2. Using a Kano diagram to highlight the geographic risk of COVID-19 (A, in the last 20 days).

If the last 10 days were applied to measure the geographic risks for regions, the top seven were Germany, Iran, South Korea, Italy, Spain, Sweden, and Norway, with {3.59, 3.59, 2.53, 2.23, 2.23, 1.88, and 1.99} and {0.23%, 13.69%, 0.91%, 47.71%, 24.44%, 0.54%, and 0.23%} for the Rasch scores and CFRs, respectively (see Fig. 3).

Figure 3. Using the Kano diagram to highlight the geographic risk of COVID-19 (B, in the last 10 days).

Readers are invited to scan the QR codes in Figures 2 and 3 to see details about the information on Google Maps, such as the doubling days for the confirmed cases of COVID-19: 5 and 7 days for Hubei (China) and South Korea, respectively.

It is worth noting that Hubei (China) falls behind in outbreak magnitude when the data from the last seven days are used for reporting, because its outbreak situation has gradually improved.

The results were consistent with the US CDC travel advisories of warning level 3 in China, Iran, and most European countries and of level 2 in South Korea on March 16, 2020.

3.2. World map on 2 issues of concern

The top 3 countries/regions (Italy, Spain, and Iran) with the highest COVID-19 transmission risks were particularly highlighted with symbols from 1 to 3 using the confirmed cases in the recent seven days dated March 16, 2020 (Fig. 5). The bubbles were sized according to the number of confirmed cases and colored by feature (i.e., ready to increase, increasing, starting to decrease, and decreasing). We can see that countries in Europe have green bubbles. In contrast, many regions (or provinces in China) have black bubbles, indicating that there has been no confirmed case in the last 7 days.

Figure 5. Division of the 4 features with a Kano diagram.

We suggest that readers scan the QR code in Figure 5 and click the link to the three line charts for the region of interest.

3.3. Four features of the outbreak shown on a dashboard

The 4 features of the outbreak for each country/region are shown in Figure 5. We can see that the bubbles were sized by the number of confirmed cases and colored by feature (e.g., increasing in green and decreasing in red). The line charts regarding the details appear when the bubble of interest has been clicked.

4. Discussion

4.1. Findings and implications

We confirmed that the information in Figure 2, obtained by using Rasch analysis and the adjusted CFRs, could highlight the travel risk of COVID-19. The results were consistent with the US CDC travel advisories of warning level 3 in China, Iran, and most European countries, and level 2 in South Korea on March 16, 2020.

4.2. What this finding adds to what we already know

In an influenza pandemic, the strength of the increase in confirmed cases is a proxy for epidemic size and disease transmissibility.[8] The US CDC has established geographic risk-stratification criteria for the purpose of issuing travel health notices for countries with the risk of COVID-19 transmission and guiding public health management for people with potential travel-related exposures to COVID-19.[9] However, there is no objective measurement system that can help us visualize the transmission risk of COVID-19 for travelers. In this study, we provided visual representations based on the risk posed by the outbreak using Rasch analysis and the CFRs based on Bayes's theorem, which was a rare strategy in the literature.

Many dashboards and websites[3,4,13–18] provide daily COVID-19-related information. None of them display such sophisticated messages on the ongoing epidemic situations as those from the Rasch modeling technique and Bayes's theorem (Figs. 2–5).

Although choropleth maps have been popularly applied in the healthcare setting,[35,36] the 2 major features of outbreak magnitudes and CFRs are included in this study to display the high travel risk for COVID-19 transmission, which differentiates this study from others[3,4,13–18,43] that only provide the number of confirmed cases or other simple information, particularly with bubbles sized by the number of confirmed cases and merely colored without other meaningful features.

4.3. What it implies and what should be changed

We provide 2 main algorithms that display the outbreak magnitudes and CFRs to highlight the regions with the highest transmission risk, which are rarely seen in the literature but are important for revealing the epidemic transmission risk. Although the computations are complex, these 2 algorithms can be routinely run on the internet, which allows us to easily examine the daily progress of the outbreak, as we have shown in the previous figures. QR codes have been provided to readers to examine the detailed information on any regions of interest on the dashboards via Google Maps.

The post-CFRs were used to examine how the particular risks appeared in regions. In this case, the 7 countries/regions were within our expectations and were listed on the US CDC website on March 16, 2020,[9] indicating that the results were reliable.

4.4. Strengths of this study

The main strengths of the current study include

  • 1. the epidemic trend displayed under the Rasch measurement (x-axes in Figs. 2 and 3);

  • 2. CFRs based on Bayes's theorem, which was enriched in this study (y-axes in Figs. 2 and 3);

  • 3. the geographic risks shown on Google Maps (Fig. 4);

  • 4. the use of 4 features to display all countries/regions in 4 respective quadrants (Fig. 5); and

  • 5. the creation of an app to demonstrate the COVID-19 situations on dashboards that use Google Maps for display.

Figure 4. The top 3 regions with the highest outbreak magnitudes. Note: black bubbles indicate a trend toward stationarity.

4.5. Limitations and future studies

Our study has some limitations. First, we were more concerned with the transmission risk in certain regions. As such, the numbers of confirmed cases were transformed into ordinal scores (e.g., from 0 to 5) to fit the Rasch model's requirement. Whether the preliminary assumptions of the Rasch model were met (e.g., local independence of items and a unidimensional scale) was not examined in this study, though Rasch analysis can be performed on such repeated measures.[44–46]

Second, although we applied CFRs to distinguish the geographic risks, the difference between the prior- and post-CFRs might emphasize the regions with higher risks based on death tolls. In contrast, the Rasch logit scores were focused on the outbreak magnitudes. A greater number of confirmed cases yields higher magnitudes due to momentum.

Third, readers might be doubtful about the different weights, which were created by transforming original counts into ordinal scores using the logarithm function, used in the Rasch analysis. Areas with more confirmed cases have lower weights, similar to the law of diminishing marginal utility in economics.[47] Otherwise, the transformation function can be substituted with other functions, such as equal interval compression (e.g., compress cases/1000 into several categories), to meet the requirement of Rasch measurement.

Fourth, the doubling days for the confirmed cases on COVID-19 have not been discussed much in this study. The use of doubling days in estimating the number of confirmed cases in a region is worth studying in the future. For instance, when the doubling days and the average length of hospitalization for deaths (ALHD) are known, the confirmed cases can be estimated by the formula of 2^(ALHD/DD) ∗ death tolls in a region.
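The estimation formula 2^(ALHD/DD) ∗ death tolls can be written out as a short function. The function name and the input values below are hypothetical and serve only to illustrate the arithmetic; they are not estimates for any real region.

```python
def estimate_confirmed_cases(death_toll: float, alhd: float, dd: float) -> float:
    """Estimate confirmed cases as 2 ** (ALHD / DD) * death toll.

    death_toll: number of deaths in the region
    alhd:       average length of hospitalization for deaths, in days
    dd:         doubling days for confirmed cases (must be positive)
    """
    if dd <= 0:
        raise ValueError("doubling days must be positive")
    return death_toll * 2 ** (alhd / dd)


# Hypothetical inputs: 100 deaths, ALHD of 14 days, doubling every 7 days.
# Two doublings occur within the hospitalization period, so the estimate
# is 4 times the death toll.
print(estimate_confirmed_cases(death_toll=100, alhd=14, dd=7))
```

The shorter the doubling days relative to the ALHD, the more doublings take place before a death is recorded, and the larger the estimated number of confirmed cases becomes.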

Furthermore, the online Rasch rating scale model[33,34] was programmed by the authors. Although many visualization models have been developed, other useful diagrams and algorithms, such as diagnosis maps and KIDMAP,[48,49] can be further elaborated and developed in the future.

Finally, we suggest using both outbreak magnitudes and CFRs to observe the transmission risk in regions. The former concerns the number of confirmed cases, and the latter relates to the death tolls. From these 2 perspectives, we can understand the transmission risks with more confidence, making them worthy of further investigation in the future.

5. Conclusion

We created an online Rasch modeling algorithm to display a visual representation of the geographic risks of the COVID-19 transmission. We are hopeful that the app will help us better understand travel risks and keep us updated on the situation of the current outbreak.

Author contributions

TWC developed the study concept and design. SC, JCJ, and YT analyzed and interpreted the data. SC monitored the process of this study and helped in responding to the reviewers’ advice and comments. TH drafted the manuscript, and all authors provided critical revisions for important intellectual content. The study was supervised by WC. All authors read and approved the final manuscript.

Acknowledgments

We thank AJE (American Journal Experts at https://www.aje.com/) for the English language review of this manuscript.

Author contributions

Conceptualization: Tsair-Wei Chien.

Data curation: Tsair-Wei Chien.

Formal analysis: Tung-Hui Jen, Jui-Chung John Lin.

Methodology: Yu-Tsen Yeh, Shu-Chun Kuo.

Resources: Tung-Hui Jen.

Software: Tsair-Wei Chien.

Supervision: Willy Chou.

Validation: Tung-Hui Jen, Willy Chou.

Writing – original draft: Tsair-Wei Chien.

Writing – review & editing: Willy Chou.

Tsair-Wei Chien orcid: 0000-0003-1329-0679.

Supplementary Material

Supplemental Digital Content
medi-99-e20774-s001.docx (12.7KB, docx)

Supplementary Material

Supplemental Digital Content
medi-99-e20774-s002.xlsx (100.3KB, xlsx)

Footnotes

Abbreviations: CFR = case fatality rate, COVID = novel coronavirus disease.

How to cite this article: Jen TH, Chien TW, Yeh YT, Lin JC, Kuo SC, Chou W. Geographic risk assessment of COVID-19 transmission using recent data: an observational study. Medicine. 2020;99:24(e20774).

All data were downloaded from the Google Sheet.

Supplemental Digital Content is available for this article.

The authors have no funding and conflicts of interest to disclose.

The datasets generated during and/or analyzed during the current study are publicly available.

References
