Within half a year, COVID-19 spread to most countries in the world and posed a great threat to public health. Non-pharmaceutical interventions (NPIs), including travel bans, proved to be an effective way to control the spread of the epidemic; for example, banning inter-city transportation stops passengers from carrying the virus between cities. However, a travel ban can significantly impact many industries, e.g., tourism and logistics, thus jeopardizing the regional economy. This paper focuses on assisting national or regional governments in making dynamic decisions on restricting and recovering intercity multi-modal travel services. Our model characterizes the impacts of inter-city traffic on the spread of COVID-19 as well as on the regional economy. By applying a reinforcement learning approach, we develop an online optimization model to identify the mode-specific travel ban strategy that balances epidemic control against the negative impacts on the regional economy. A numerical study based on a network of multiple cities in China shows that the proposed approach can generate better strategies than some existing methods.
In 2020, the outbreak of COVID-19 became a disaster for all of humanity. The outbreak was first detected in December 2019. Soon after, the WHO declared the outbreak a Public Health Emergency of International Concern (PHEIC) on January 30. By the end of July, nearly 15 million people were infected and over 500 thousand people died worldwide. More than 4 million people have been infected in the United States and 150 thousand have died. In the early stage, non-pharmaceutical interventions (NPIs), including travel bans, social distancing, school closures and emergency response, seemed to be the only effective way to prevent and slow down the pandemic while vaccines were still under research. For instance, public transportation increases the risk of person-to-person contact and aggravates the spread of the epidemic across cities or countries, so travel restrictions help to control the risk of epidemic spreading. Wuhan adopted a travel ban on January 23, right after the outbreak and before the Chinese New Year, restricting intercity traffic and closing the airports, railway stations and highways. The travel ban remained in effect until April 8, two weeks after the declaration of indigenous pandemic interruption. Many studies suggested that Wuhan's travel restrictions had a positive effect on controlling the spread of COVID-19. Some studies indicated that the travel quarantine of Wuhan delayed the overall epidemic progression in mainland China (Chinazzi et al., 2020; Tian et al., 2020). Leung et al. (2020) found that the reproduction number decreased substantially in all selected cities and provinces after control measures were implemented and has since remained below 1, which indicates that the epidemic has been controlled. With travel restriction (no imported exposed individuals to Beijing), the number of infected individuals in seven days will decrease by 91.14% in Beijing, compared with the scenario of no travel restriction (Tang et al., 2020).
As COVID-19 spread around the world, many other countries also adopted travel bans or traffic restrictions to protect people from the disease. For example, Australians have followed the urgings of the National Cabinet and all levels of government to limit travel and social contact, which has thus far resulted in a “flattening of the curve” (Beck and Hensher, 2020). At the same time, almost all airline companies reduced flights, and a larger reduction in air travel through airports in a large part of the cumulative incidence area would lead to a gradual decrease in the risk flow (Nakamura and Managi, 2020). In addition to government decisions, individual actions are also important. However, individuals do not internalize the external cost of the infection risks they impose on others and on the health care system when making their own travel (social-activity) decisions. Government action is therefore necessary to induce individual travel decision-makers to internalize this external cost (Oum and Wang, 2020).
However, while epidemic control has been achieved, society as a whole is paying a high price for the various NPIs. The complex, dynamic social-economic system needs mobility, and a long-lasting travel ban has direct and indirect impacts on many businesses, such as tourism, restaurants and cinemas, leading to more unemployment. Recent studies provide evidence of this. For example, Guan et al. (2020) show that COVID-19 control measures affected the global supply chain and caused losses, and that the complexity of global supply chains magnified losses beyond the direct effects of COVID-19. Other studies show that differences in the strictness of such measures, and in the rapidity with which governments imposed and relaxed them, have divergent impacts on public health and economic losses (Wells et al., 2020; Inoue and Todo, 2019). In fact, many cities and organizations around the world are calling for transportation and production to resume as soon as possible. Deciding when to restore traffic and restart intercity mobility is a critical decision for governments. Therefore, it is important to understand the dynamics of epidemic spreading, in particular to estimate the risk of transmission over time, and to analyze the effect of control policies. This paper focuses on evaluating travel ban policies with different levels of strictness, considering the balance between epidemic control and economic development. A network-based epidemic spreading model is established in which each node represents a city. The links between nodes represent the connections between cities, and different travel modes as well as the mode choice process are considered. The paper proposes a reinforcement-learning (RL)-based multi-modal decision-making framework under uncertainty. The RL-based framework seeks a policy that balances public health risk prevention and economic development.
The paper also evaluates various potential epidemic control strategies from the public health intervention side and compares the RL-based approach with some traditional control policies.
The rest of the paper is organized as follows. Section 2 elaborates on the framework of the methodology, the network formulation, and the epidemic transmission model. Section 3 introduces the RL-based decision-making framework and its execution procedure. Section 4 presents a series of case studies, including three virtual city networks of different sizes and a real city network containing 15 cities; the experiments illustrate epidemic spreading across the cities and the effect of different travel ban policies. Section 5 concludes the paper and offers some policy implications.
2. Methodology
2.1. City network modeling
2.1.1. Network setting
We adopt a directed graph to represent the network of cities, consisting of a finite set of nodes and a set of directed arcs, where each arc is associated with a travel mode m. An arc from node n to another node under mode m exists if and only if there is a link connecting these two nodes such that people can move directly from city n to the other city by traffic mode m. For each node, we can find its set of successor nodes, i.e., the nodes that can be reached directly from it.
2.1.2. Node representation
Each node stands for a city, and every node has a label with three attributes: the location of city n; the compartment set, whose elements sum to the population of city n; and the set of parameters of the epidemic model, which are explained in Section 2.2.
2.1.3. Link representation
Each link l in the network stands for one travel mode from one city to another. Each link has four attributes: the distance between city n and the destination city, and the travel time, travel cost and capacity from city n to the destination city by mode m.
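The node and link attributes described above can be held in a small data structure. The following is a minimal sketch, not the authors' implementation; the class names `City`, `Link` and `CityNetwork`, the field names, and the mode labels are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class City:
    """A node: location, compartment counts and epidemic parameters."""
    name: str
    location: tuple                      # e.g. (x, y) coordinates (assumed format)
    compartments: dict                   # e.g. {"S": ..., "E": ..., "I": ..., "H": ..., "R": ..., "D": ...}
    params: dict = field(default_factory=dict)

    @property
    def population(self):
        # The source states that the compartment counts sum to the city population.
        return sum(self.compartments.values())


@dataclass
class Link:
    """A directed arc for one travel mode, with its four attributes."""
    origin: str
    destination: str
    mode: str
    distance: float
    travel_time: float
    travel_cost: float
    capacity: int


class CityNetwork:
    """Directed multigraph: at most one arc per (origin, destination, mode)."""

    def __init__(self):
        self.cities = {}
        self.links = {}

    def add_city(self, city):
        self.cities[city.name] = city

    def add_link(self, link):
        self.links[(link.origin, link.destination, link.mode)] = link

    def successors(self, name, mode=None):
        # Cities reachable directly from `name`, optionally restricted to one mode.
        return sorted({d for (o, d, m) in self.links
                       if o == name and (mode is None or m == mode)})
```

Keying the links by (origin, destination, mode) keeps parallel arcs for different modes between the same pair of cities, which is what the multi-modal setting requires.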
2.2. Epidemic modeling
The modeling of epidemic spreading is a tool that has been used to study the mechanisms by which diseases spread, to predict the future course of an outbreak, and to evaluate strategies to control an epidemic. The most popular epidemic model is the compartment model (Kermack and McKendrick, 1927), in which the population is assigned to different compartments with labels, and people can transfer between compartments. The order of the labels usually shows the flow patterns between the compartments. The labels differ between diseases; common labels include S, I, and R (Susceptible, Infectious, Recovered). In the study of COVID-19, the susceptible-exposed-infectious-recovered (SEIR) model (Aron and Schwartz, 1984), shown in Fig. 1, is one of the most widely adopted methods. The population is divided into four compartments: susceptible (denoted by S, the people who are able to contract the disease), exposed (E, the people who have been infected by the disease but are not yet infectious), infectious (I, the people who are capable of transmitting the disease), and removed (R, the people who have recovered with immunity or died). People progress through these four compartments. The SEIR model and its improved versions have been widely used in COVID-19 related research (Kucharski et al., 2020; Wu et al., 2020; Chen et al., 2020). In this paper, an improved SEIR model is coupled with the network model as one integrated part. When considering the travel of infected people between cities, we also consider the risk of people being infected both in the city and during transportation. Because the infection characteristics in cities and in transportation differ slightly, the model treats the two separately.
To be more in line with the actual situation, we improve the base model and develop an extended SEIHRD model, since in practice infected persons can only be found by nucleic acid testing (NAT), and some patients are never effectively identified. The new compartment H represents patients who have been admitted to the hospital, and compartment I represents unreported infected persons. Compartment R stands for recovered persons and compartment D for the dead. We distinguish between compartments I and H because those who are found to be infected are sent to the hospital; they do not move to other cities, let alone infect others. Undetected infections, including hidden infections and undetected overt infections, become the source of infection. In addition, the cure rate and death rate depend on whether the infected person is hospitalized.
Fig. 2 shows how people progress through the compartments of the SEIHRD model. The rate of spreading is associated with the probability of transmitting the disease between susceptible people and exposed people, which is controlled by the infectious rate β. Because exposed people are also infectious in the case of COVID-19, we use separate infectious rates for exposed people and infected people. The rates at which exposed people become infected persons and hospitalized patients are defined as the incubation rates. Finally, the cure rates and mortality rates of untreated patients and hospitalized patients are different. Because the COVID-19 control period we focus on is only several months, the effect of the birth rate and natural death rate on the population is ignored, and the population of each city changes only through travel between cities. The model also tracks the susceptible people who move from city i to city j by mode m. Eqs. (1), (2), (3), (4), (5), (6) mathematically depict the entire progress dynamics.
The spreading of COVID-19 does not occur only within cities. Some evidence shows that COVID-19 also spreads very easily within public transportation, and the risk of spreading is related to capacity and crowdedness. As a result, the SEIHRD models we apply in our city network fall into two categories: node-based SEIHRD and link-based SEIHRD. The node-based model describes the epidemic situation inside a city, while the link-based model describes the risk of epidemic spreading in transportation, such as the airplane, high-speed train (HST), coach and private car. Although there is no hospital on a bus, an infected person with severe symptoms will be sent for treatment after getting off, so we also include the H compartment in the link-based model.
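To make the compartment flows concrete, the sketch below advances one SEIHRD population by a single time step. It is an illustration under stated assumptions rather than a transcription of Eqs. (1)-(6): the parameter names (`beta_E`, `beta_I`, `sigma_I`, `sigma_H`, `gamma_I`, `gamma_H`, `mu_I`, `mu_H`), the forward-Euler update, and the normalization by the living population are all assumed, and the inter-city travel terms are omitted.

```python
def seihrd_step(state, p, dt=1.0):
    """One forward-Euler step of a single-population SEIHRD model (assumed form)."""
    S, E, I, H, R, D = (state[k] for k in "SEIHRD")
    living = S + E + I + H + R

    # Both exposed (E) and unreported infected (I) people can infect susceptibles;
    # hospitalized patients (H) are assumed not to infect anyone.
    new_exposed = (p["beta_E"] * E + p["beta_I"] * I) * S / living if living > 0 else 0.0

    to_I = p["sigma_I"] * E      # exposed -> unreported infected
    to_H = p["sigma_H"] * E      # exposed -> hospitalized (detected)
    rec_I, die_I = p["gamma_I"] * I, p["mu_I"] * I   # outcomes without treatment
    rec_H, die_H = p["gamma_H"] * H, p["mu_H"] * H   # outcomes in hospital

    return {
        "S": S - dt * new_exposed,
        "E": E + dt * (new_exposed - to_I - to_H),
        "I": I + dt * (to_I - rec_I - die_I),
        "H": H + dt * (to_H - rec_H - die_H),
        "R": R + dt * (rec_I + rec_H),
        "D": D + dt * (die_I + die_H),
    }
```

Every outflow from one compartment appears as an inflow to another, so the total across the six compartments is conserved, which is consistent with ignoring births and natural deaths. The same function could in principle be reused for the node-based and link-based models with different parameter dictionaries.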
2.2.1. Time-varying infection rate
To control the epidemic, the government implemented a series of public health intervention policies, e.g., emergency response and social distancing, which are an important factor affecting the spread of the epidemic. However, the strength of these measures is difficult to quantify accurately. We use a truncated (piecewise) function to describe the change of the infectious rate β over time. Before the government initiates the first-level response and takes any prevention policies, β remains constant. After that, β decays exponentially with time, approaching zero. This setting is in accordance with some relevant studies on SARS (Ni, 2009). Under the assumption of a monotonically decreasing infection rate, the epidemic model considers a single outbreak or several independent outbreaks. For the scenario with independent outbreaks, we can apply the model at the beginning of each outbreak.
(7)
where β is the initial infectious rate and ξ denotes the parameters controlling the exponential decay speed.
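The behavior described in this subsection, constant before the first-level response and exponentially decaying afterwards, can be sketched as follows. The function name and the argument names `beta0`, `xi` and `t0` are hypothetical, and a single decay parameter is assumed.

```python
import math


def infection_rate(t, beta0, xi, t0):
    """Time-varying infectious rate (assumed form).

    beta0: initial infectious rate
    xi:    exponential decay speed (a single parameter is assumed here)
    t0:    day on which the government initiates the first-level response
    """
    if t < t0:
        return beta0                      # no intervention yet: constant rate
    return beta0 * math.exp(-xi * (t - t0))   # decays toward zero afterwards
```

For independent outbreaks, the same function could be applied again with `t` measured from the start of each outbreak.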
2.3. Travel demand modeling
For the sake of simplification, we presently assume that the travel demand between cities follows the gravity model. The intensity of travel demand between two cities is proportional to the populations of the two cities and inversely proportional to the square of the distance between them. A normally distributed random term models the randomness in demand. A certain percentage of people, itself a random number, leaves each city, and the distribution of this travel demand is related to the populations of the destinations. The travel demand from city i to city j on day t is calculated by Eq. (8):
(8)
where the populations of city i and city j and the distance between city i and city j enter as described above, and G is a scaling parameter.
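As an illustration of the gravity relationship stated above, the sketch below computes the demand between two cities. How the normal random term enters is not recoverable from the text, so the multiplicative form `(1 + noise)` and the names `sigma` and `rng` are assumptions.

```python
import random


def gravity_demand(pop_i, pop_j, distance, G, sigma=0.0, rng=None):
    """Travel demand from city i to city j under a gravity model.

    Demand is proportional to both populations and inversely proportional
    to the squared distance; G is the scaling parameter.
    The multiplicative normal noise is an assumed way to add randomness.
    """
    base = G * pop_i * pop_j / distance ** 2
    noise = (rng or random).gauss(0.0, sigma) if sigma > 0 else 0.0
    return max(0.0, base * (1.0 + noise))   # demand cannot be negative
```

Passing a seeded `random.Random` instance as `rng` makes simulated demand reproducible across runs.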
2.4. Travel ban action
As one of the primary methods to control the epidemic, the travel ban has been adopted by many governments. We define a binary variable for each link l under mode m at time t that equals 1 if the link is open and 0 otherwise. We consider two scenarios for performing the travel ban.
2.4.1. Scenario I
In Scenario I, the government performs the policies when they are needed. In other words, the government decides both when to impose the travel ban policies and when to cancel them and recover the traffic. The imposing and lifting of the travel ban can each be performed only once in the entire period. All links are assumed to be open at the beginning. Eqs. (9), (10), (11) show the travel ban under Scenario I.
(9)
(10)
(11)
2.4.2. Scenario II
In some cases, the outbreak of COVID-19 is a very urgent event, and the government should apply all available policies as soon as possible. In Scenario II, we assume that all travel is banned as soon as possible, and the government only needs to decide whether to cancel the policies and recover travel. Eqs. (12), (13), (14) show the travel ban under Scenario II.
(12)
(13)
(14)
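The two scenarios can be read as constraints on the open/closed sequence of each link. The sketch below checks a sequence of daily statuses (1 for open, 0 for closed) against each scenario. It is an interpretation of the verbal description, not a transcription of Eqs. (9)-(14); in particular, Scenario II is assumed to start with the link already closed on the first day.

```python
def _switches(status):
    """List the (previous, next) pairs at which the link status changes."""
    return [(a, b) for a, b in zip(status, status[1:]) if a != b]


def is_valid_scenario_one(status):
    """Scenario I: open at the start, then at most one closure and one reopening."""
    return (bool(status) and status[0] == 1
            and _switches(status) in ([], [(1, 0)], [(1, 0), (0, 1)]))


def is_valid_scenario_two(status):
    """Scenario II: closed from the start (assumed), then at most one reopening."""
    return (bool(status) and status[0] == 0
            and _switches(status) in ([], [(0, 1)]))
```

For example, `[1, 1, 0, 0, 1]` satisfies Scenario I, while `[1, 0, 1, 0]` does not because the link is closed twice.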
2.5. Mode choice and travel demand transfer
We consider multi-modal transportation between cities. Because of the travel ban, our model considers the compensation between transportation modes: if a transportation mode stops operating, the traffic originally allocated to that mode is transferred to other modes. Taking the high-speed train (HST) and the aircraft as an example, if flights are banned, a large part of the passengers will switch to HST. The flow between two cities is associated with a vector that characterizes its distribution across the different modes.
Furthermore, each mode of transportation has a capacity limit. If the number of passengers choosing a certain mode exceeds its capacity limit, the excess demand cannot be met by that mode. We adopt an improved multinomial logit (MNL) model to describe the choice of travelers. In the traditional MNL model, travelers are characterized by their utility. Assuming that the utility has a deterministic part, expressed in terms of the travel cost and the travel time, and a random error following a Gumbel distribution, the mode choice probability follows a multinomial logit form such that:
(15)
(16)
where ρ is the travelers' average value of time, for which we use the social average hourly income. The mode choice probability then satisfies:
(17)
The mode-specific flow is subject to the capacity constraint of each mode, i.e., the flow on a mode cannot exceed its capacity. To handle this constraint, we propose a multiple traffic distribution method, which can also be treated as a dynamic ticketing process. Every passenger chooses a mode according to the mode choice probability. If the number of passengers choosing a certain mode reaches its capacity, the extra passengers can only choose from the other modes that have not reached capacity. The process ends when all travelers have successfully chosen a mode or all modes have no remaining capacity. The mode choice procedure is shown in Algorithm 1.
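The mode choice and the capacity-constrained redistribution described above can be sketched as follows. This is not Algorithm 1 itself: it works with expected flows instead of sampling individual passengers, it assumes the deterministic utility is the negative generalized cost (travel cost plus ρ times travel time), and the scale parameter `theta` is a hypothetical addition. Banned modes are simply left out of the input dictionaries.

```python
import math


def mnl_probabilities(costs, times, rho, theta=1.0):
    """Multinomial logit probabilities from generalized cost (assumed utility form)."""
    utils = {m: -theta * (costs[m] + rho * times[m]) for m in costs}
    u_max = max(utils.values())                       # subtract max for numerical stability
    weights = {m: math.exp(u - u_max) for m, u in utils.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def assign_with_capacity(demand, probs, capacity):
    """Distribute demand over modes; overflow is re-split among modes with room left.

    Returns (flow per mode, unmet demand).
    """
    flow = {m: 0.0 for m in probs}
    remaining = float(demand)
    active = {m for m in probs if capacity[m] > 0}
    while remaining > 1e-9 and active:
        norm = sum(probs[m] for m in active)
        if norm <= 0:
            break
        overflow = 0.0
        for m in list(active):
            share = remaining * probs[m] / norm       # renormalized over open modes
            room = capacity[m] - flow[m]
            if share >= room:
                flow[m] = capacity[m]                 # mode is sold out
                overflow += share - room
                active.discard(m)
            else:
                flow[m] += share
        remaining = overflow
    return flow, remaining
```

Each round either assigns all remaining travelers or saturates at least one mode, so the loop terminates after at most as many rounds as there are modes.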
The implementation of non-pharmaceutical interventions (NPIs) has a side effect on business and the economic system. The tool should be of interest to policymakers and others who wish to understand the potential impacts of the travel ban on health and economic outcomes in their communities. They can then weigh the trade-offs between various policies and decide when and how these interventions can be relaxed. A travel ban helps to prevent the spread of the epidemic, but it also hinders economic development.
Our society and economic system are so complex that it is hard to calculate the system loss directly, so we use an approximate method. Mobility, represented by traffic turnover, brings about the circulation of regional business and personnel and promotes consumption and industry development. Evidence from existing studies supports this; for example, ShunLi et al. (2005) showed that GDP and traffic turnover volume in China have a Granger co-integration relationship, and Donzelli (2010) found that the spreading of traffic demand increases the rate of international tourism, generates new jobs and improves the income of the area. Taking the direct impact of mobility on regional consumption into account, we take Total Retail Sales of Consumer Goods (TRSCG), denoted by E, as the economic indicator and traffic turnover volume as the traffic indicator. We assume the economic impact is a convex function of traffic turnover volume. Considering a series of other economic factors, we construct an econometric model and analyze the impact of the variables by regression. The economic growth contributed by traffic turnover volume can be calculated from the marginal effect. The approximate economic growth function is shown in Eq. (18).
(18)
where the explanatory variables are the traffic turnover volume and the other economic factors. The marginal economic growth contributed by traffic turnover volume can be calculated by Eq. (19).
(19)
2.7. Objective function
Policymakers hope to achieve two goals by implementing a travel ban. The first goal is to prevent the spread of the epidemic, which can be simplified as minimizing the number of confirmed patients so that the cumulative number of infections is as small as possible. At the same time, they hope that the side effect of the interventions, which can be treated as the effect on economic growth, is as small as possible.
Taking the above considerations into account, the objective function can be formulated as Eq. (20):
(20)
where the case counts are the new confirmed cases at time step t in city n. In the objective function, the first two terms stand for the total numbers of confirmed cases and deaths, respectively. The last term stands for the total economic growth contributed by traffic turnover volume, which is to be maximized. The first two terms are weighted by the average treatment cost of an epidemic patient and by the death loss, which emphasizes the importance of saving lives although it is not an actual expenditure. The goal of the entire decision-making process is to minimize the above objective function.
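The structure of the objective described above (weighted cases plus weighted deaths minus economic growth) can be sketched as below. The exact form of Eq. (20) is not reproduced here; the function name, the argument layout (one list of per-city counts per day), and the use of a common monetary unit for all three terms are assumptions.

```python
def travel_ban_objective(new_cases, new_deaths, growth, treatment_cost, death_loss):
    """Total cost to be minimized over the study period (assumed form).

    new_cases[t][n], new_deaths[t][n]: counts for city n on day t
    growth[t]: economic growth contributed by traffic turnover on day t
    treatment_cost, death_loss: weights in the same monetary unit as growth
    """
    total = 0.0
    for cases_t, deaths_t, growth_t in zip(new_cases, new_deaths, growth):
        total += treatment_cost * sum(cases_t)   # cost of treating confirmed cases
        total += death_loss * sum(deaths_t)      # loss attributed to deaths
        total -= growth_t                        # growth reduces the objective
    return total
```

Under this reading, the per-day reward used by the learning agent would be the negative of each day's contribution to this sum.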
3. Reinforcement-learning-based decision making framework
3.1. Motivation and setting
The dynamic travel ban decision problem is hard to solve directly. Policymakers can only make policies based on previous and current epidemic data, and the actions taken at any stage affect future states, which must be considered in the decision-making process. Owing to the stochasticity of traffic and epidemic spreading, we cannot formulate the travel ban problem in a compact form. The next subsection introduces an online framework for deciding the travel ban policy based on the reinforcement learning technique.
3.2. Online travel ban policy determination
The travel ban decision process can be treated as a Markov Decision Process (MDP). The entire period of epidemic spreading is divided into days, and each day is a decision interval. At the start of each interval, the government decides the status of each link under each mode, closed or open, and the status remains unchanged within the day. The state in day t is defined as a vector whose entries give, for each link, the average number of new hospitalized patients in the cities on both sides of the link, with the initial number of hospitalized patients defining the initial state. The dimension of the state vector equals the cardinality of the link set. Likewise, the action in day t is defined as a vector whose entries give the status of each link l in day t, where 1 means open and 0 means closed. The sets of all possible states and actions form the state space and the action space, respectively.
The policymakers decide policies based on the current number of confirmed cases. This process can be simulated as a learning process. The agent behaves in an environment according to a policy π that specifies how the agent selects an action in each state, i.e., π maps states to actions. The goal of the agent is to find the best policy, which maximizes the long-term expected cumulative reward at the current time. To solve such a problem, a common objective is to learn an action-value function, which gives the expected discounted return of taking an action in a state and following the policy thereafter, with a discount factor weighting future rewards. The learning process seeks a solution of the Bellman equation, which relates the value of the current state-action pair to the immediate reward and the value at the next state S’.
Because the problem is defined on the network, the dimension of the action space grows so large that the problem cannot be solved with traditional tabular methods. We present a framework based on Twin Delayed Deep Deterministic Policy Gradient (TD3) (Fujimoto et al., 2018) for the online determination of the travel ban policy in each day, using the above definitions. TD3 is based on the Actor-Critic (AC) framework, and the AC method originates from Deep Q-Learning (DQN). The AC method and its improved versions have been widely applied in online decision-making problems, such as taxi dispatching (Kim et al., 2020) and adaptive traffic signal control (Aslani et al., 2017). TD3 avoids overestimation and reduces the function approximation error of the AC method by delaying policy updates.
In day t, we define the reward as the sum of the economic growth contributed by traffic turnover volume and the economic loss caused by confirmed cases and deaths in this interval, as shown in Eq. (21). The total expected discounted return is the discounted sum of these rewards. During the training process, TD3 also uses experience replay to sample previous transitions randomly, thereby smoothing the training distribution over many past behaviors.
(21)
Following Fujimoto et al. (2018), we create two critic networks and an actor network, together with their target networks. Clipped double-Q learning with two critic networks helps to avoid the overestimation of the Q-value, as shown in Eq. (22). At the same time, the parameters of the target networks and the actor network are frozen for a fixed number of iterations while the critic networks are updated by gradient descent, in order to reduce variance and enhance the stability of the algorithm. In this paper, all networks are 3-layer fully connected networks in which the hidden layer has 256 units.
(22)
The experience tuple is stored in the replay memory, and our algorithm samples uniformly at random from the memory when performing updates. The update of the critics follows Eq. (23), and that of the actor follows Eq. (24). We use the Adam algorithm (Kingma and Ba, 2014) as the optimizer. The detailed procedure of our TD3 implementation is provided in Algorithm 2.
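The clipped double-Q idea and the delayed actor update can be illustrated without any neural network code. The sketch below follows the general TD3 recipe of Fujimoto et al. (2018) and is not taken from Algorithm 2; the function names and the `terminal` flag are hypothetical, and the two next-state Q-values are assumed to come from the two target critics.

```python
def clipped_double_q_target(reward, q1_next, q2_next, gamma, terminal=False):
    """Learning target shared by both critics.

    Taking the minimum of the two target-critic estimates counteracts
    the overestimation that a single critic tends to produce.
    """
    if terminal:
        return reward                      # no future value after the last day
    return reward + gamma * min(q1_next, q2_next)


def actor_update_due(iteration, policy_delay):
    """The actor and target networks are updated only every `policy_delay` critic updates."""
    return iteration % policy_delay == 0
```

With a reward of 1.0, target estimates of 5.0 and 3.0, and a discount factor of 0.9, the target is 1.0 + 0.9 × 3.0 = 3.7: the smaller estimate is used.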
This last subsection introduces the execution procedure that combines the proposed RL framework with the solving framework for dynamically determining the travel ban and recovery policies. For the implementation, we first need to establish a simulation environment for the city network that considers epidemic spreading and the movement of people. The detailed execution procedure is illustrated in Fig. 3. As observed, there are two key components in the framework, i.e., the emulator and the RL brain. Throughout the study period, the city network model coupled with the SEIHRD model simulates the epidemic spreading process. First, the travel demand is generated by a demand model (e.g., the gravity model shown in Eq. (8)). After applying the travel ban given by the RL brain, the network updates the link status and completes the travel mode choice process. Then, the traveling people enter the link-based SEIHRD model, while the people staying in cities enter the node-based SEIHRD model. All the SEIHRD models are established separately and are independent of each other, in order to account for the differences between cities and links. Finally, the traveling people enter their destination cities, and the whole city network updates its situation and calculates the current reward. On the policymaker side, the RL brain receives the epidemic situation as the state and then gives the travel ban policy for the current time (whether each link is open or not) to the city network. When one simulation step is over, the city network returns the reward.
In this section, we perform a variety of experiments to test the performance of the proposed travel ban decision framework and to gain some insights. We evaluate the solutions by the cumulative reward defined by Eq. (21).
4.1.1. Epidemic model settings
The parameter setting reflects the characteristics of the disease and has an important impact on the performance of the epidemic model. As the understanding of COVID-19 is still developing, the current parameters are taken from the existing literature or news. The epidemic model parameter setting is shown in Table 1. Asymptomatic patients can infect others, and there is no evidence that the infectivity of asymptomatic patients is weaker than that of infectious patients (Wu, 2020), so we take the two infectious rates to be equal. At the same time, the cure rate and fatality rate of unreported patients and hospitalized (reported) patients are different. As of July 30, the country with the highest crude fatality rate (death cases/confirmed cases) was Yemen, at 28.35%, followed by France at 16.33%. We therefore estimate that the fatality rate of unreported patients is 15%. Without a run on medical resources, the fatality rate of hospitalized patients is much lower than that of unreported patients. In early March, the fatality rate of patients in the provinces of China other than Hubei was 0.86%, so we estimate that the fatality rate of hospitalized patients is 1%.
However, no study has carefully analyzed the infection rate on different means of travel. We set the infection rate of each mode according to its crowdedness and capacity, with reference to the city infection rate. Among the three collective transit modes, the airplane is the most crowded (economy class). Although the capacity of HST is the highest (considering all the cabins), the probability of people moving across cabins is not high. The vast majority of coach services are for short trips; coach passengers normally stay at the station for only a short time, so the infection risk at the station is much lower than at the airport or railway station. The research of Zhang et al. (2020) also suggests that flights contribute the most to confirmed cases, followed by HST and coach. For the private car, the infection rate is set separately, because a car can take up to 5 people and is generally not shared with strangers for intercity travel. Besides, the infectious rate in transportation does not decay over time.
4.1.2. Economic growth model settings
We have established an econometric model to study the marginal contribution of traffic turnover to economic growth. We use Total Retail Sales of Consumer Goods (TRSCG) as the dependent variable and Public Finance Expenditure (PFE), Profits of Industrial Enterprises (PIE), Total Import and Export (TIE), and total traffic turnover volume as the explanatory variables. Mobility promotes the development of the consumer industry. PFE reflects government intervention in society as well as the country's macroeconomic situation. PIE affects the level of disposable income. TIE reflects international trade, which affects many industries in China. We collected monthly data for China from 2010 to 2019, a total of 120 observations, and estimated the regression model. The results indicate that traffic turnover volume has a significant impact on the TRSCG. We estimate the marginal effect of traffic turnover volume and use it to calculate the marginal economic growth. Besides, according to public information, the average treatment cost in China and the death loss are set to 32 thousand yuan and 200 thousand yuan, respectively.
The setting of these cost parameters affects the objective function. It also reflects the relative importance that the government places on economic development and epidemic control. The sensitivity analysis is conducted in Section 4.2.
4.1.3. RL model settings
The setting of parameters, shown in Table 2, affects the performance of the RL algorithm. We adjust the learning rate in different scenarios for better learning performance. The total study period is 50 days, and the number of training iterations is 500.
We select 15 metropolises in China and collect the epidemic data (the numbers of confirmed cases, recovered cases, and deaths) as well as the intercity traffic data. The epidemic data are from the National Health Commission (NHC), and the traffic data are collected from the Baidu migration index. We collect the epidemic and traffic data from Jan. 23rd to Mar. 13th (50 days), a period that covers a complete outbreak and control process of COVID-19 in China. Facing the COVID-19 pandemic, most provincial governments initiated the first-level response and considered imposing a travel ban around Jan. 23rd, while millions of people were traveling between cities to return to their hometowns. That was the day when Wuhan, the first city to report COVID-19 cases, cut off its traffic connections with the outside. About 50 days later, the COVID-19 epidemic in China had been roughly brought under control. Another two weeks later, on Mar. 28th, the Chinese government declared that indigenous transmission had been interrupted.
4.2. Experiments on hypothetical city networks
4.2.1. Virtual city network construction
We first apply our model to several virtual city networks. There may not be a direct road connection between some pairs of cities, i.e., the graph representing the network is not necessarily a complete graph. Some studies have pointed out that terrestrial transportation networks, such as railway or highway networks, are a type of small-world network (Aldrich et al., 2015; Viana & da Fontoura Costa, 2011). The small-world architecture has also been used to describe a next-generation airline network (Sawai, 2012). A small-world network is a type of mathematical graph in which most nodes are not neighbors of one another, but the neighbors of any given node are likely to be neighbors of each other, and most nodes can be reached from every other node in a small number of hops or steps. Many kinds of small-world networks exist in the literature, among which the best-known family, the Watts-Strogatz (WS) network, was proposed by Watts and Strogatz (1998). In a WS network, the typical distance between two randomly chosen nodes (the number of steps required) grows proportionally to the logarithm of the number of nodes N in the network. The construction process of the WS network is described by Algorithm 3.
We use the WS network to represent the traffic connections between cities. We generate three networks with and name them WS-10, WS-15, WS-20. Fig. 4 illustrates the three city networks, which have 40, 60, and 80 directed arcs, respectively.
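The construction of such a network can be sketched with the standard Watts-Strogatz procedure: build a ring lattice, then rewire each edge with some probability. The code below is a minimal illustration and not a reproduction of Algorithm 3. The mean degree `k = 4` is inferred from the reported arc counts (40, 60, 80 arcs for 10, 15, 20 nodes, with each undirected edge counted as two directed arcs), and the rewiring probability `p` is an assumed value, since neither is stated in this section.

```python
import random


def watts_strogatz_edges(n, k, p, seed=None):
    """Undirected Watts-Strogatz graph as a set of (i, j) pairs with i < j."""
    if k % 2 or k >= n:
        raise ValueError("k must be even and smaller than n")
    rng = random.Random(seed)

    def norm(a, b):
        return (a, b) if a < b else (b, a)

    # Step 1: ring lattice, each node linked to k/2 neighbours on each side.
    edges = {norm(i, (i + d) % n)
             for i in range(n) for d in range(1, k // 2 + 1)}

    # Step 2: rewire the far end of each lattice edge with probability p,
    # avoiding self-loops and duplicates, so the edge count stays n * k / 2.
    for d in range(1, k // 2 + 1):
        for i in range(n):
            if rng.random() >= p:
                continue
            free = [v for v in range(n)
                    if v != i and norm(i, v) not in edges]
            if free:
                edges.remove(norm(i, (i + d) % n))
                edges.add(norm(i, rng.choice(free)))
    return edges


def to_directed_arcs(edges):
    """Represent each undirected edge by two directed arcs."""
    return [(i, j) for i, j in edges] + [(j, i) for i, j in edges]


# Assumed settings: k = 4 (inferred from arc counts), p = 0.3 (illustrative).
networks = {f"WS-{n}": to_directed_arcs(watts_strogatz_edges(n, 4, 0.3, seed=0))
            for n in (10, 15, 20)}
```

Because rewiring never adds or removes edges, the arc count depends only on `n` and `k`, which is why the three networks have exactly 4 arcs per node.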
Other information about the city is randomly generated within a certain range. The location of each node is generated randomly, and the distance between the two cities is calculated from their locations. The city population is a random number from 0.5 to 10 million. The travel demands between cities are generated by the gravity model introduced in section 2.3. The initial number of confirmed cases varies from 0 to 30. The time for the government to initiate the first-level response is set as 0. The travel time and ticket price of different travel modes are generated according to the distance between the two cities.
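The random generation described above can be sketched as follows. Eq. (8) is not reproduced in this section, so the code assumes the classic gravity form, in which demand is proportional to the product of the two populations and inversely proportional to a power of the distance. The constants `k` and `gamma`, the size of the map (`extent`), and the function names are all hypothetical; only the population range (0.5 to 10 million) and the initial case range (0 to 30) come from the text.

```python
import math
import random


def make_cities(n, seed=None, extent=1000.0):
    """Generate n virtual cities with random location, population and cases."""
    rng = random.Random(seed)
    return [
        {
            "xy": (rng.uniform(0, extent), rng.uniform(0, extent)),
            "population": rng.uniform(0.5e6, 10e6),   # range from the text
            "initial_cases": rng.randint(0, 30),      # range from the text
        }
        for _ in range(n)
    ]


def gravity_demand(cities, k=1e-9, gamma=2.0, min_dist=1e-6):
    """Assumed gravity form: T_ij = k * P_i * P_j / d_ij ** gamma."""
    demand = {}
    for i, a in enumerate(cities):
        for j, b in enumerate(cities):
            if i == j:
                continue
            # Euclidean distance between the two random locations,
            # floored to avoid division by zero for coincident points.
            d = max(math.dist(a["xy"], b["xy"]), min_dist)
            demand[(i, j)] = k * a["population"] * b["population"] / d ** gamma
    return demand
```

In a sparse network such as WS-10, only the pairs that correspond to existing arcs would be kept.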
4.2.2. Real city network construction
To show that the model can be applied in the real world, we select 15 metropolises in China and construct a real city network, named Real-15. The population, location, initial confirmed cases and the time to start the first-level emergency response are shown in Table 3. The start day is Jan. 23rd, 2020. Because these cities are all metropolises with very convenient transportation, we assume that Real-15 is a fully connected network.
At the same time, we collect the epidemic data (the numbers of confirmed cases, recovered cases and deaths) and the traffic data. Although the migration index is collected from vehicle data, it reflects the travel intensity in different cities.
Fig. 5 shows the change of the migration index over the whole study period. The migration index quickly decays to 0 in all the cities, especially in Wuhan because of the travel ban. Many cities tried to recover intercity traffic in early March. Some southeastern cities, like Guangzhou, Shenzhen, and Chengdu, recover particularly fast. Baidu Map also provides the share of the migration index going to each destination city. Based on the migration index, the share rate and the standard volume, we can estimate the daily traffic demand between all cities during the whole period.
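A plausible reading of this estimate is a simple product: the origin's migration index, times the share going to each destination, times the number of travelers represented by one unit of the index. The sketch below assumes that multiplicative form and that interpretation of "standard volume"; the function name and the input layout are hypothetical.

```python
def daily_demand(migration_index, share, standard_volume):
    """Estimate origin-destination demand for one day.

    migration_index: dict origin -> out-migration index on that day
    share:           dict origin -> dict destination -> fraction of the index
    standard_volume: travelers per unit of index (assumed interpretation)
    """
    return {
        (origin, dest): migration_index[origin] * fraction * standard_volume
        for origin, dests in share.items()
        for dest, fraction in dests.items()
        if dest != origin
    }
```

Repeating the calculation for each day of the study period gives the time series of demand between every pair of cities.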
Using the real epidemic data and traffic data, we can calibrate our epidemic model for every city. The comparison between real data and simulation results is shown in Fig. 6. The curve of cumulative confirmed cases fits well in most cities. We note that the curves produced by the SEIHRD model are smooth, but the real curves may not be as smooth because of incidents such as statistical errors and changes in diagnostic criteria.
In the three virtual city networks, the travel demand is generated by the gravity model shown in Eq. (8). In the real city network, the travel demand is taken as the average travel demand from Jan. 17th to Jan. 19th. The demand is then multiplied by a random number to reflect its randomness.
We design two benchmark policies for comparison, i.e., the all-ban policy and the no-ban policy. Under the all-ban policy, all the links are closed for the whole period, and the traffic turnover volume is zero. Conversely, all the links stay open under the no-ban policy, and the traffic turnover volume is maximized. Fig. 7 illustrates the comparison between the all-ban policy and the no-ban policy in Real-15. Because very strict travel restrictions were implemented in real life, the number of confirmed cases under the all-ban policy is similar to the actual situation. However, if the no-ban policy is implemented, the number of infected people in cities other than Wuhan rises sharply. Therefore, the travel ban can effectively prevent the epidemic from spreading among different cities.
The comparison of benchmark policies in 15 cities.
The RL framework is implemented in the city networks WS-10, WS-15, WS-20 and Real-15. The learning performance in WS-10 is shown in Fig. 8. The TD3 method converges well after 300 epochs. We take the higher of the objective function values of the all-ban and no-ban policies as the benchmark, and we find that the RL framework obtains a policy that clearly outperforms this benchmark.
Convergence of TD3 framework online for travel ban policy decision.
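For readers unfamiliar with TD3 (Fujimoto et al., 2018), its two variance- and bias-reducing devices can be sketched in a few lines: target policy smoothing, which adds clipped noise to the target action, and the clipped double-Q target, which takes the smaller of two critic estimates. The code below is a generic illustration and not the authors' implementation. The default values are those commonly used with TD3 rather than the settings of Table 2, and the action bounds of 0 and 1 are an assumption about how link decisions might be encoded.

```python
import random


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           low=0.0, high=1.0, rng=random):
    """Target policy smoothing: add clipped Gaussian noise, then clip to bounds."""
    smoothed = []
    for a in action:
        noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
        smoothed.append(max(low, min(high, a + noise)))
    return smoothed


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller critic estimate."""
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(q1_next, q2_next)
```

Taking the minimum of the two critics counteracts the overestimation that a single critic tends to accumulate; TD3 additionally updates the actor less often than the critics, which is not shown here.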
Table 4 compares the travel ban policies produced by the RL framework with the benchmark policies. As mentioned above, we introduce two travel ban scenarios. In scenario I, the government decides both the close time and the recovery time. In scenario II, the government only decides the recovery time. RL-SCO1 and RL-SCO2 stand for the RL strategies under scenario I and scenario II, respectively. The all-ban policy is slightly better than the no-ban policy, and the policies applied in the actual situation in Real-15 are better than both the no-ban and all-ban policies. However, the policies given by the RL framework beat all the benchmarks in all networks. In WS-10, the policy from RL-SCO1 achieves a lower number of confirmed cases while ensuring some mobility. In WS-15, RL-SCO2 gives the best policy: traffic turnover volume is guaranteed with a slight increase in confirmed cases. In WS-20, RL-SCO1 guarantees traffic turnover volume at the expense of a small increase in confirmed cases. In Real-15, RL-SCO1 and RL-SCO2 achieve nearly the same objective function value, though the policies are different. In short, the RL framework helps to find a better travel ban policy, which increases the value of the objective function by 1.31%–5.46%.
Table 4.
Tabular comparison of learned policies and benchmark.
The setting of parameters in the objective function has a great effect on the decision. The values of reflect the importance placed on epidemic control and economic development, and they also affect our evaluation of a policy. Table 5 shows the sensitivity analysis in WS-15 under the All-ban and No-ban policies. The values of do not affect the simulation result under the two benchmark policies, but they change the value of the objective function. We vary these values from a decrease of 60% to an increase of 100%. Each number in the table represents the improvement of the All-ban policy over the No-ban policy. The upper right part of the table represents the case in which economic operations are very important while epidemic prevention and control are relatively less urgent; in that case, the No-ban policy is better. In contrast, the lower left part of the table puts epidemic prevention first; in that case, the All-ban policy is better.
Table 5.
Sensitivity analysis for parameters in objective function.
By analyzing the learned policy, we can also understand its characteristics. Fig. 9 illustrates the best learned policy in WS-15 under scenario II. The x-axis is time and the y-axis is the percentage of open links on each day. Under this policy, the open rate increases with time. We can also observe that the open rate of the private car stays the highest while that of HST is the lowest. The result suggests that different policies can be developed for different travel modes, for example, by removing highway restrictions first.
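The quantity plotted on the y-axis can be computed directly from a scenario II policy. The sketch below assumes such a policy is stored as one reopening day per link and per mode, which is a hypothetical encoding; the 50-day horizon comes from the study period.

```python
def open_rate_by_day(recovery_day, horizon=50):
    """Share of open links per mode on each day of the study period.

    recovery_day: dict mode -> list of reopening days, one entry per link
                  (a day >= horizon means the link never reopens).
    Returns dict mode -> list of length `horizon` with values in [0, 1].
    """
    rates = {}
    for mode, days in recovery_day.items():
        rates[mode] = [
            sum(day <= t for day in days) / len(days)
            for t in range(horizon)
        ]
    return rates
```

Since a link stays open once it has been recovered, the resulting curves are non-decreasing in time, which matches the upward trend described above.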
We also find some characteristics of the links that are opened early. Fig. 10 shows a bar plot of link demand in the middle period in WS-10. Some links with low demand are recovered first. The result suggests that low-demand traffic links can be recovered first, which guarantees the necessary economic connections without too much risk of the epidemic spreading.
The COVID-19 pandemic has greatly affected economic development and people's daily lives across the world. Public transport increases the risk of people-to-people contact and aggravates the spread of COVID-19 across cities. It is important for the government to take effective policy interventions to reduce the damage of the epidemic. Meanwhile, travel restrictions also disrupt a wide range of industries (e.g., retail sales, tourism, entertainment), causing a recession in the national economy. For example, China's GDP in the first quarter fell by 5.3% compared to the same period last year. Therefore, it is very important to develop effective policy-making criteria and implement them prudently to balance these two concerns.
In the post-COVID-19 world, outbreaks of COVID-19 continue to recur in almost every corner of the world. In countries such as Spain, France and Switzerland,2 strict travel restriction policies are being implemented. From SARS (2003) and MERS (2015) to COVID-19 (2019), mankind has repeatedly faced epidemics, and it still faces threats from many other infectious diseases (e.g. Ebola, Zika, H7N9). With the availability of big data nowadays, data-driven decision-making becomes crucial for governments. In the early stages, epidemiologists can obtain some key parameters of the infectious disease, including the infectious rate β, as well as some characteristics from the confirmed cases. With this key information, the model can simulate inter-city spreading and provide useful suggestions for travel managers. At the same time, from the proposed methodology and numerical experiments, we can learn that:
• Cutting off all the traffic connections is essential in the early stages of a severe spread of the epidemic.
• Highways can be considered first for reopening because traveling by private car has the lowest cross-infection probability.
• HST, airplane, and coach services should be reopened with care because the clustering risk still exists; passengers may be infected not only during the trip but also at stations with potentially huge crowds.
• Some traffic links with relatively low demand can be recovered first, as they can guarantee necessary economic connections without too much risk of the epidemic spreading.
5. Conclusions
This paper develops a methodological framework for determining multi-modal dynamic travel ban policies during an epidemic, in order to balance epidemic control and regional economic development. Coupled with the SEIHRD epidemic model, we develop a city network model containing multi-modal traffic between cities. By applying node-based and link-based epidemic models, the city network model takes the risk of infection in cities and in transportation into account. Based on MNL, a mode choice model is established to allocate demand under capacity constraints. At the same time, mobility plays an important role in the socio-economic system and has a direct or indirect effect on many industries. We establish a model to estimate the marginal economic growth contributed by traffic turnover volume in order to demonstrate the trade-off between the control of epidemic spreading and economic loss.
To support dynamic decision-making, we develop an RL-based framework for online multi-modal travel ban decisions. We adopt TD3 as the RL algorithm to reduce variance and avoid overestimation. We construct three virtual city networks using the WS network (WS-10, WS-15, and WS-20), as well as a real city network, Real-15. The calibration results indicate that our model fits realistic data well. We adopt the all-ban and no-ban policies as the benchmarks. The policies given by the RL framework outperform every benchmark, improving the objective function by 1.31%–5.46% across the four city networks; the objective function considers the losses caused by the spread of the epidemic and the economic growth contributed by traffic turnover volume. At the same time, the learning result indicates that differentiated policies for different intercity travel modes can be considered.
This study introduces the RL framework into network-level epidemic modeling and considers the online multi-modal travel ban decision problem. Our study only provides a preliminary sketch; several extensions can be made in future studies. First, additional policies that consider the features of the complex network can be studied. Second, the economic growth model can be further enhanced to capture the microscopic mechanisms and achieve better estimation in actual cases. Third, our numerical example only considers realistic cases in China, whereas the balance between epidemic control and economic maintenance is now an urgent issue faced by most governments around the world. Lastly, the epidemic model currently adopted only considers one-wave spreading, and how to extend it to incorporate multi-wave cases remains a valuable and challenging topic for future investigation.
Acknowledgement
This research is supported by grants from the National Key Research and Development Program of China (2018YFB1601600).
Aldrich P.R., El-Zabet J., Hassan S., Briguglio J., Aliaj E., Radcliffe M., Mirza T., Comar T., Nadolski J., Huebner C.D. Monte Carlo tests of small-world architecture for coarse-grained networks of the United States railroad and highway transportation systems. Phys. Stat. Mech. Appl. 2015;438:32–39.
Aron J.L., Schwartz I.B. Seasonality and period-doubling bifurcations in an epidemic model. J. Theor. Biol. 1984;110:665–679. doi: 10.1016/s0022-5193(84)80150-2.
Aslani M., Mesgari M.S., Wiering M. Adaptive traffic signal control with actor-critic methods in a real-world traffic network with different traffic disruption events. Transport. Res. C Emerg. Technol. 2017;85:732–752.
Beck M.J., Hensher D.A. Insights into the impact of covid-19 on household travel and activities in Australia the early days under restrictions. Transport Pol. 2020;96:76–93. doi: 10.1016/j.tranpol.2020.07.001.
Chen T.-M., Rui J., Wang Q.-P., Zhao Z.-Y., Cui J.-A., Yin L. A mathematical model for simulating the phase-based transmissibility of a novel coronavirus. Infect. Dis. Poverty. 2020;9:1–8. doi: 10.1186/s40249-020-00640-3.
Chinazzi M., Davis J.T., Ajelli M., Gioannini C., Litvinova M., Merler S., Pastore y Piontti A., Mu K., Rossi L., Sun K., Viboud C., Xiong X., Yu H., Halloran M.E., Longini I.M., Vespignani A. The effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak. Science. 2020;368:395–400. doi: 10.1126/science.aba9757.
Donzelli M. The effect of low-cost air transportation on the local economy: evidence from southern Italy. J. Air Transport. Manag. 2010;16:121–126.
Fujimoto S., Van Hoof H., Meger D. 2018. Addressing Function Approximation Error in Actor-Critic Methods. arXiv preprint arXiv:1802.09477.
Guan D., Wang D., Hallegatte S., Davis S.J., Huo J., Li S., Bai Y., Lei T., Xue Q., Coffman D., et al. Global supply-chain effects of covid-19 control measures. Nat. Human Behav. 2020:1–11. doi: 10.1038/s41562-020-0896-8.
Inoue H., Todo Y. Firm-level propagation of shocks through supply-chain networks. Nature Sustain. 2019;2:841–847.
Kermack W.O., McKendrick A.G. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. - Ser. A Contain. Pap. a Math. Phys. Character. 1927;115:700–721.
Kim B., Kim J., Huh S., You S., Yang I. Multi-objective predictive taxi dispatch via network flow optimization. IEEE Access. 2020;8:21437–21452.
Kingma D.P., Ba J. 2014. Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:1412.6980.
Kucharski A.J., Russell T.W., Diamond C., Liu Y., Edmunds J., Funk S., Eggo R.M., Sun F., Jit M., Munday J.D., et al. The lancet infectious diseases; 2020. Early Dynamics of Transmission and Control of Covid-19: a Mathematical Modelling Study.
Leung K., Wu J.T., Liu D., Leung G.M. First-wave covid-19 transmissibility and severity in China outside hubei after control measures, and second-wave scenario planning: a modelling impact assessment. Lancet. 2020;395(10233):1382–1393. doi: 10.1016/S0140-6736(20)30746-7.
Li R., Pei S., Chen B., Song Y., Zhang T., Yang W., Shaman J. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2) Science. 2020;368:489–493. doi: 10.1126/science.abb3221.
Nakamura H., Managi S. Airport risk of importation and exportation of the covid-19 pandemic. Transport Pol. 2020;96:40–47. doi: 10.1016/j.tranpol.2020.06.018.
Ni S. Tsinghua University; 2009. Research on Modeling of Infectious Disease Spreading Based on Complex Network Theory. Ph.D. thesis.
Oum T.H., Wang K. Socially optimal lockdown and travel restrictions for fighting communicable virus including covid-19. Transport Pol. 2020;96:94–100. doi: 10.1016/j.tranpol.2020.07.003.
Sawai H. 2012 IEEE Congress on Evolutionary Computation. IEEE; 2012. Reorganizing a new generation airline network based on an ant-colony optimization-inspired small-world network; pp. 1–8.
ShunLi W., LinLiu J., PingXu G. Central South Highway Engineering; 2005. Co-integration Analysis of the Impact of the Gdp Net Growth on the Traffic Turnover Volume Net Growth; p. 51.
Tang B., Wang X., Li Q., Bragazzi N.L., Tang S., Xiao Y., Wu J. Estimation of the transmission risk of the 2019-ncov and its implication for public health interventions. J. Clin. Med. 2020;9:462. doi: 10.3390/jcm9020462.
Tian H., Liu Y., Li Y., Wu C.-H., Chen B., Kraemer M.U.G., Li B., Cai J., Xu B., Yang Q., Wang B., Yang P., Cui Y., Song Y., Zheng P., Wang Q., Bjornstad O.N., Yang R., Grenfell B.T., Pybus O.G., Dye C. An investigation of transmission control measures during the first 50 days of the covid-19 epidemic in China. Science. 2020;368:638–642. doi: 10.1126/science.abb6105.
Viana M.P., da Fontoura Costa L. Fast long-range connections in transportation networks. Phys. Lett. 2011;375:1626–1629.
Wells C.R., Sah P., Moghadas S.M., Pandey A., Shoukat A., Wang Y., Wang Z., Meyers L.A., Singer B.H., Galvani A.P. Impact of international travel and border control measures on the global spread of the novel 2019 coronavirus outbreak. Proc. Natl. Acad. Sci. Unit. States Am. 2020;117:7504–7509. doi: 10.1073/pnas.2002616117.
Wu J.T., Leung K., Leung G.M. Nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, China: a modelling study. Lancet. 2020;395:689–697. doi: 10.1016/S0140-6736(20)30260-9.
Wu Z. Contribution of asymptomatic and pre-symptomatic cases of covid-19 in spreading virus and targeted control strategies. Zhonghua liu xing bing xue za zhi= Zhonghua liuxingbingxue zazhi. 2020;41:801–805. doi: 10.3760/cma.j.cn112338-20200406-00517.
Zhang Y., Zhang A., Wang J. Exploring the roles of high-speed train, air and coach services in the spread of covid-19 in China. Transport Pol. 2020;94:34–42. doi: 10.1016/j.tranpol.2020.05.012.