Abstract
Disease propagation between countries strongly depends on their effective distance, a measure derived from the world air transportation network (WAN). It reduces the complex spreading patterns of a pandemic to a wave-like propagation from the outbreak country, establishing a linear relationship to the arrival time of the unmitigated spread of a disease. However, in the early stages of an outbreak, what concerns decision-makers in countries is understanding the relative risk of active cases arriving in their country—essentially, the likelihood that an active case boarding an airplane at the outbreak location will reach them. While there are data-fitted models available to estimate these risks, accurate mechanistic, parameter-free models are still lacking. Therefore, we introduce the ‘import risk’ model in this study, which defines import probabilities using the effective-distance framework. The model assumes that airline passengers are distributed along the shortest path tree that starts at the outbreak’s origin. In combination with a random walk, we account for all possible paths, thus inferring predominant connecting flights. Our model outperforms other mobility models, such as the radiation and gravity model with varying distance types, and it improves further if additional geographic information is included. The import risk model’s precision increases for countries with stronger connections within the WAN, and it reveals a geographic distance dependence that implies a pull- rather than a push-dynamic in the distribution process.
Author summary
For the spread of a contagious disease, human mobility puts distant places in proximity, while geographically closer targets may be effectively much further away. The worldwide flight network is crucial for long-distance travel, and the previously proposed ‘effective distance’ translates this mobility into a distance measure that correlates with the disease arrival time. We use the effective distance to generate a bottom-up and thus parameter-free distribution process of passengers on the flight network, which takes into account all possible flight routes. This allows us to determine the import probability of a disease. Our ‘import risk’ model outperforms or matches established mobility models, some of which require calibration with scarce or costly data. In contrast, our approach relies on minimal flight network data, that is, the number of planes between airports and their passenger capacities, but not on passenger data. Its bottom-up approach enables future studies on country-specific measures for controlling and containing infected passengers, a challenge with existing models. Thus, the ‘import risk’ model’s strength lies in its data simplicity, its relevance to pandemics, and its parameter-free design.
Introduction
Recent decades have seen a considerable increase in mobility: The worldwide number of passenger cars in use increased by an average of about 4% each year between 2006 and 2015, reaching approximately 1 billion in 2015 [1]. This growth is comparable to the yearly increase in the number of sea containers shipped [2], and the global scheduled air passenger count also experienced an annual growth of about 6% between 2004 and 2019 [3]. In essence, the world is becoming increasingly interconnected in terms of passenger mobility, both on a small scale (cars) and a large scale (air traffic), as well as in the import and export of goods. This heightened connectivity facilitates the distribution of goods and people, as demonstrated by the distribution of over 400 invasive species through agricultural imports, which is best predicted by the global trade network [4]. A prime example of unwanted side effects of well-connected regions is the potential for pandemics, accompanied by death, economic damage and the potential stigmatization of survivors, migrants and minorities [5–7]. Even the first plague pandemic, which started in AD 541 in the Nile Delta of Egypt, spread within 8 years across the territories (Mediterranean, Northern Europe and Near East) of 2 affected empires because of the intense commerce in the Roman Empire [6]. Nowadays, the intensified exchange reduces the time until a pandemic reaches all parts of the world to months, as for the 2009 H1N1 virus, which spread from Mexico to all continents in 5 months [8, 9], or the recent COVID-19 pandemic, whose variants spread across the globe within a few months [10–13].
The connection strength between world regions is only partly explained by their geographic proximity. Instead, due to historic geopolitical relations [14, 15], pandemics spread along an effective distance that is derived from the world air transportation network (WAN) [16–19] or, if applied on a smaller scale, also from other means of transportation [16, 20]. According to the effective distance, region B is closest to region A if the passenger flow from A to B is greater than to other destinations. An intriguing extension is the multipath effective distance, which enhances the prediction of disease arrival times by considering all paths taken by a random walker on the WAN [17]. The effective distance is regularly used to analyze the impact of mobility on the spread of diseases, for example for MERS [21], Ebola [22], Zika [23] and most recently COVID-19 [20, 24–26]. While it enables a qualitative estimation of disease arrival times, its applicability is severely restricted when it comes to describing the importation of infected passengers from a specific source to a target. However, these import events are highly relevant for political decision-makers and for modeling predictions.
In this work, we describe these import events via the “import probability” p(B|A), which is equivalent to the origin-destination (OD) matrix whose element TBA represents the number of trips from A to B, with the difference that the probability is normalized by all trips starting in A, i.e. p(B|A) = TBA/TA. There are mobility models that fit the OD matrix and therefore require a reference OD matrix, as seen in the gravity model [27–31]. Additionally, some models integrate OD matrix-fitted models on a smaller scale with the OD matrix of the global air transportation network, creating a multiscale mobility network to represent all modes of transportation [32, 33]. Note that the multiscale mobility model has been successfully employed to analyze past pandemics [34–36]. Yet, it can be extremely difficult to obtain the OD matrix, and most often it is estimated by small surveys [37] or alongside a census [38]. Even for the air transportation network derived from a booking system, the OD matrix is only an approximation, since passengers increasingly book directly at the airlines (in 2015, 30% of all Lufthansa flights were booked directly, which increased to 52% in 2018 [39]) and not via the big GDSs (global distribution systems) from which most OD estimates are derived [40, 41]. This means that to compute the air transportation OD matrix exactly, bookings of all GDSs and about 900 airlines must be purchased or estimated and then combined. Thus, models that do not rely on an existing reference OD matrix are important. Those either assume an underlying decision process without integrating traffic information, as in the radiation model [42, 43], or they apply a maximum entropy approach to distribute the unknown OD trips along possible routes of a known traffic network [30, 44, 45]. However, none of the above approaches uses the effective distance with its qualitative link to disease propagation, and none is based on a mechanistic distribution process on a traffic network.
In our understanding, a mechanistic process mimics the detailed movement behavior of passengers on the traffic network. It neither uses only quantities of and between the locations (gravity and radiation model) nor relies on principles of systems in thermodynamic equilibrium (maximum entropy model); in other words, it is a bottom-up approach. This approach grants us a mechanistic understanding of the observed patterns, enabling us to investigate how modifications impact passenger distribution. For instance, we can analyze how containment interventions along distribution routes reduce the import probability of infected passengers.
In this work, we introduce the import risk model, based on a distribution process following the shortest path tree of the WAN based on effective distance. This process is combined with a random walker that explores all potential paths within the WAN. We use WAN data from the year 2014 and compare the resulting estimates to the Global Transnational Mobility Dataset from 2014 [40], which serves as a ground truth baseline. Additionally, we quantify the discrepancy between this reference and the estimates of the import risk model and of alternative mobility models, such as the gravity [27, 31] and radiation [43] models, through multiple comparison measures. We find that the import risk model outperforms the alternative models and improves only slightly when it includes not only WAN information but also the geodesic distance between airports. Lastly, we evaluate the quality of import probability estimation for specific countries and assess if and how the geodesic distance is encoded in the import risk estimate.
Results
Relating the WAN, OD-probability and the effective distance
In this work, we introduce the import risk, which estimates the probability of a passenger departing from airport A to conclude their journey at any airport worldwide, even those not directly connected to the origin airport. The estimation is based on the traffic flow of airplanes and the respective maximal passenger capacity between airports, a.k.a. the world air transportation network (WAN), provided by the Official Airline Guide (OAG) [46]. This inference-problem is intriguing because it is much easier to monitor the origin and destination of airplanes, than of passengers with possibly multiple connecting flights until their final destination. In our study, we use the WAN from 2014 (Fig 1A) and compare the derived import probabilities to a reference dataset. The reference import probability is based on the Global Transnational Mobility Dataset (GTN) from 2014 [40, 47], which combines an origin-final-destination dataset from a major global distribution system (GDS) with a tourism dataset from the World Tourism Organization (Fig 1B, see Material and methods for more details on the data). Before introducing the import risk model, we contrast the two datasets, introduce the effective distance [16] and quantify its potential as the base metric for our proposed model.
By comparing the world air transportation network (WAN) with the country-specific reference import probability from the GTN (compare Fig 1A and 1B), we see that the airports connected via direct links belong to countries that also have a high import probability. Nevertheless, due to physical constraints and logistical optimization, not all countries with non-zero import probabilities are directly connected to airports in the source country; instead, they are reached via connecting flights. In the context of import probability, estimates based on geodesic distance and the population of the target country are useful but exhibit limitations in certain scenarios. For instance, with Canada as the source country, the import probability for Italy is approximately 1.4 times greater than that for Germany, even though Germany is geographically closer to Canada and has a larger population. The effective distance is an alternative network-based distance measure that does not rely solely on direct connections and geographic information [16–19]. Instead, it is based on the passenger flow Fij from j to i and its relationship to the outflow Fj through the transition probability Pij = Fij/Fj. Together with a constant distance offset d0, the effective distance between directly connected airports is
$$d_{\mathrm{eff}}(i|j) = d_0 - \ln P_{ij} \tag{1}$$
The effective distance between airports without direct connection is the cumulative distance along the shortest path tree (SPT) derived from deff, as illustrated for the largest Canadian airport (Toronto Pearson Airport, YYZ) in Fig 1C. Note that a distance offset of d0 = 0 would make two routes indistinguishable as long as the product of the transition probabilities along each route is the same, but with d0 > 0 the one route with fewer connecting flights is effectively shorter. Previous studies have demonstrated that the arrival time of diseases in countries exhibits a linear dependence on their effective distance [16–19]. We show that the import probability also correlates with deff (Fig 1D), whereby the correlation is higher than for other distance measures (see Fig A in S1 Text). In fact, the import probability decays exponentially with effective distance (linear decay on a semi-log scale in Fig 1D) which can be reproduced in a simplified model for a passenger that travels at a constant effective speed and has a constant exit rate. Therefore, the effective distance seems to be a good representation of the underlying distribution process, and is a promising candidate for the base of our proposed import risk model, to directly estimate the import probability.
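The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: it assumes the effective distance of a direct link is d0 − ln(Pij) and finds the shortest path tree with Dijkstra's algorithm. The input layout `flows[j][i]` (flow from j to i), the function name `effective_distances`, the toy network and the choice d0 = 1 are all hypothetical.

```python
import heapq
import math


def effective_distances(flows, source, d0=1.0):
    """Dijkstra on link lengths d0 - ln(P_ij), with P_ij = F_ij / F_j.

    flows[j][i] is the passenger flow from node j to node i (assumed layout).
    Returns the effective distance to each reachable node and the parent
    of each node on the shortest path tree rooted at `source`.
    """
    dist = {source: 0.0}
    parent = {source: None}
    heap = [(0.0, source)]
    while heap:
        d, j = heapq.heappop(heap)
        if d > dist.get(j, math.inf):
            continue  # stale heap entry, a shorter route was already found
        out = flows.get(j, {})
        total = sum(out.values())  # outflow F_j
        for i, f in out.items():
            if f <= 0:
                continue
            candidate = d + d0 - math.log(f / total)
            if candidate < dist.get(i, math.inf):
                dist[i] = candidate
                parent[i] = j
                heapq.heappush(heap, (candidate, i))
    return dist, parent


# Toy network: A sends 95% of its passengers to hub B and 5% directly to C.
flows = {
    "A": {"B": 95, "C": 5},
    "B": {"A": 10, "C": 90},
    "C": {"A": 50, "B": 50},
}
dist, parent = effective_distances(flows, "A")
print(parent["C"], round(dist["C"], 3))
```

In this toy network the route from A to C via the hub B is effectively shorter than the weak direct link, even though it involves one more flight and therefore pays the offset d0 twice.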
Import risk model
The idea behind the import risk model is a combination of two elements: (i) a random walk with an exit probability of the walker to finish its travel at the current node and (ii) a distribution mechanism derived from the deff SPT (Fig 2). The use of a random walk is motivated by Iannelli et al. [17] who could improve the arrival-order prediction of deff by including all possible paths. The exit probability enables us to combine the random walk with a distribution mechanism that assigns the likelihood of each node being the final destination, as explained in detail in the second step. In the first step, we use the transition network representation of the WAN and let a random walker start at source n0 and after each step it either exits at the current node i with exit probability qi or continues to walk. Let us define the walker’s probability to continue walking to node n given it was at node n − 1 before and originally started in n0 by
$$S_{n,n-1}(n_0) = \left(1 - q_{n-1}(n_0)\right) P_{n,n-1} \tag{2}$$
with Pn,n−1 as the transition probability from n − 1 to n. Now the probability to walk along a path Γ starting at n0 and exiting at n is the probability to continue walking Si,j along each link (i, j) that is part of the path times the exit probability of the final node
$$p(\Gamma) = q_n \prod_{(i,j) \in \Gamma} S_{i,j} \tag{3}$$
where we omitted the explicit dependence on the source n0. Our goal is to describe all possible paths the walker can take from n0 to n. We will use the matrix S, whose elements are the probabilities to continue walking Si,j. The element (i, j) of the product of the matrix with itself, S ⋅ S = S², sums over all paths of length l = 2 that end at i and start at j. Next, we can define the probability of a walker to exit at n after traversing all paths of length l as
$$p_l(n|n_0) = q_n \left(\mathbf{S}^l\right)_{n,n_0} \tag{4}$$
Finally, the import risk is the probability to exit at n given all paths of all lengths
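$$p_\infty(n|n_0) = \sum_{l=1}^{\infty} p_l(n|n_0) = q_n \left[\mathbf{S}\left(\mathbf{I} - \mathbf{S}\right)^{-1}\right]_{n,n_0} \tag{5}$$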
where we used the convergence of the geometric series with identity matrix I.
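The random-walk step can be illustrated with a short Python sketch. It is a simplified stand-in rather than the authors' code: instead of inverting a matrix, it accumulates the contributions of paths of length l = 1, 2, … until the remaining walking probability is negligible, which is numerically equivalent to the closed form of the geometric series whenever the series converges. The function name `import_risk`, the column-stochastic layout `P[i][j]` (transition from j to i), the toy values and the choice of a zero exit probability at the source are assumptions made for this example.

```python
def import_risk(P, q, source, tol=1e-12, max_steps=10_000):
    """Probability to exit at each node, summed over paths of all lengths.

    P[i][j] is the transition probability from node j to node i,
    q[j] the exit probability at node j (both assumed inputs).
    """
    n = len(q)
    # S[i][j]: probability not to exit at j and then to walk from j to i.
    S = [[P[i][j] * (1.0 - q[j]) for j in range(n)] for i in range(n)]
    visits = [0.0] * n  # accumulates sum over l >= 1 of (S^l)[i][source]
    current = [0.0] * n
    current[source] = 1.0
    for _ in range(max_steps):
        current = [sum(S[i][j] * current[j] for j in range(n)) for i in range(n)]
        visits = [v + c for v, c in zip(visits, current)]
        if sum(current) < tol:  # almost no walker is still travelling
            break
    return [q[i] * visits[i] for i in range(n)]


# Toy example with three nodes; node 0 is the source.
P = [
    [0.0, 0.5, 0.5],
    [0.8, 0.0, 0.5],
    [0.2, 0.5, 0.0],
]
q = [0.0, 0.5, 1.0]  # hypothetical exit probabilities
p_inf = import_risk(P, q, source=0)
print([round(x, 6) for x in p_inf])
```

Because every walker eventually exits in this toy setting, the returned probabilities sum to one, which is a useful sanity check for any implementation of this step.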
In the second step, we approximate the exit probability qi(n0) that we used above, but did not specify yet. Thereby, we assume that passengers start at source airport n0, travel along the SPT and exit at node i with an exit-probability
$$q_i(n_0) = \frac{N(i)}{N(i) + \sum_{k \in \Omega(i|n_0)} N(k)} \tag{6}$$
with N(i) as the population at airport i and Ω(i|n0) as the set of all offspring nodes downstream of i on the SPT centered at source n0. Hence, the exit probability at node i is determined by the ratio of the population at node i to the combined populations of all downstream nodes of i on the SPT, inclusive of node i.
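As a small illustration of this ratio, the following Python sketch computes exit probabilities from a shortest path tree given as a parent map. The names `exit_probabilities`, `parent` and `population`, as well as the toy numbers, are hypothetical; how the source node itself is treated is not specified here, so the sketch simply applies the same ratio to every node.

```python
def exit_probabilities(parent, population):
    """Exit probability per node: own population over subtree population.

    parent[node] is the node's parent on the shortest path tree
    (None for the source); population[node] is N(node).
    """
    children = {node: [] for node in parent}
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    # Pre-order traversal: every node appears before its descendants.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    # Accumulate subtree populations from the leaves up to the root.
    subtree = {node: population[node] for node in order}
    for node in reversed(order):
        par = parent[node]
        if par is not None:
            subtree[par] += subtree[node]

    return {node: population[node] / subtree[node] for node in order}


parent = {"A": None, "B": "A", "C": "B", "D": "A"}
population = {"A": 100, "B": 50, "C": 30, "D": 20}
q = exit_probabilities(parent, population)
print(q)
```

Leaves of the tree have no downstream nodes and therefore exit with probability one, while an intermediate node such as B passes on the share of passengers that corresponds to its downstream population.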
We estimate the population at airport i using its outflow on the WAN, denoted as N(i) = Fi. To aggregate the import probabilities at the country level, we sum the targets and apply a weighted average to the source airports, with population serving as the weighting factor.
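The country-level aggregation can be sketched as follows. This is one possible reading of the description above, not a confirmed implementation: import probabilities are summed over all target airports of a country and averaged over the source airports of a country with weights proportional to their populations. The function name `aggregate_to_countries`, the layout `p[source][target]` and the toy data are assumptions, and the sketch keeps domestic targets, which the study may handle differently.

```python
def aggregate_to_countries(p, population, country):
    """Aggregate airport-level import probabilities to country level.

    p[source][target]: airport-level import probability (assumed layout).
    population[airport]: population proxy, e.g. the airport's outflow.
    country[airport]: country the airport belongs to.
    Returns agg[target_country][source_country].
    """
    countries = sorted(set(country.values()))
    total = {c: 0.0 for c in countries}
    for airport, c in country.items():
        total[c] += population[airport]

    agg = {t: {s: 0.0 for s in countries} for t in countries}
    for source, targets in p.items():
        s = country[source]
        weight = population[source] / total[s]  # population-weighted average
        for target, prob in targets.items():
            agg[country[target]][s] += weight * prob  # sum over targets
    return agg


country = {"A1": "X", "A2": "X", "B1": "Y"}
population = {"A1": 300, "A2": 100, "B1": 50}
p = {
    "A1": {"B1": 0.6, "A2": 0.4},
    "A2": {"B1": 0.2, "A1": 0.8},
    "B1": {"A1": 0.5, "A2": 0.5},
}
agg = aggregate_to_countries(p, population, country)
print(agg["Y"]["X"])
```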
To elucidate how additional information about the geographic distance between nodes influences p∞, we explore two variations of the import risk model: In the variation with “geodesic distance weighted” exit probability the populations in Eq 6 are substituted with , where is the geodesic distance between i and n0. To control for increasing model complexity, we study the “effective distance weighted” exit probability, where , i.e. no geographic information is used, but the model structure is equivalent.
Alternative models
Numerous alternative models estimate the OD-matrix, from which the import probability can be derived [30, 31, 42, 43, 49–52]. Among those, the gravity [27] and the intervening opportunity [42, 43] model are most widely used. A recent variant of the latter is the radiation model [43]. Although past studies have found that the gravity model outperforms the radiation model at small scale [38, 53, 54], especially the radiation model’s good performance at the large scale [38, 54] makes it an interesting model for mobility on the WAN. It was originally conceptualized for commuter flows [43] where the surrounding populations serve as a proxy for possible job opportunities. By estimating an airport’s population based on its outflow, we adjust the concept from job opportunities to tourism opportunities. Its derivation from a mechanistic decision process makes it parameter free, and therefore similar and a good comparison to our model. However, it only requires information on the population density and does not integrate flight data.
We compare our model to the gravity model with an exponential and a power-law distance dependence and to the radiation model (see Material and methods for definitions). These models rely solely on the outflow data from the WAN to estimate the node’s population and on the geographic locations. To incorporate structural information of the WAN [55], the alternative models are also implemented with the geodesic path distance (the geodesic distance along the SPT) and the effective distance, i.e. there are in total nine alternative models: the radiation model and the gravity model with exponential and with power-law distance decay function, each implemented with geodesic, geodesic path and effective distance. The exponents of the six gravity models are fitted to the reference import probability by determining the best-fitting exponent for each of the six comparison measures (Pearson correlation, root-mean-square error, common part of commuters, Kendall’s rank correlation, and the correlation and RMSE of the logarithmic measures, all defined in Material and methods) and taking their mean value (see Figs B and C in S1 Text). As comparison measures, we have chosen three measures that are related to the absolute error and three that are related to the relative error between estimate and reference.
Symmetry by returning visitors
Each of the twelve models provides an estimate for the import probability p(i|n0), which is used to compute the OD-matrix T through multiplication with the corresponding source population N(n0). By comparing the symmetry of T with that of the reference OD-matrix, we find a much higher and qualitatively different symmetry in the reference data (see Supplementary Note B, Fig D in S1 Text). The high symmetry is likely due to visitors (family, business, tourism, etc.) who dominate international travel. They return to their home location after a limited period [56], and only a minority of the travelers are migrants, i.e. stay permanently at the destination. Interestingly, the import risk model has the highest symmetry, but is still less symmetric than the reference data by a factor of 4. Therefore, before conducting a detailed comparison of the estimates, we rectify the import probability estimates by symmetrizing their OD-matrix (by extracting the symmetric part and recalculating the import probability; for further details, refer to Material and methods and Supplementary Note B in S1 Text). This correction can be seen as an alternative version of a doubly constrained model, where normally the constraints on in- and out-flow are ensured by iterative proportional fitting [31].
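A minimal Python sketch of such a symmetrization is given below. It assumes that “extracting the symmetric part” means replacing T by (T + Tᵀ)/2 and that the import probability is then recomputed by normalizing each source column; the exact procedure is defined in Material and methods and may differ in detail. The function name and the toy numbers are hypothetical.

```python
def symmetrize_import_probability(p, population):
    """Symmetrize the OD matrix implied by p and renormalize per source.

    p[i][j]: import probability from source j to target i (assumed layout).
    population[j]: source population N(j).
    """
    n = len(population)
    # OD matrix: T[i][j] = p(i|j) * N(j), trips from j to i.
    T = [[p[i][j] * population[j] for j in range(n)] for i in range(n)]
    # Symmetric part (assumption): average of T and its transpose.
    T_sym = [[0.5 * (T[i][j] + T[j][i]) for j in range(n)] for i in range(n)]
    # Renormalize each source column to obtain probabilities again.
    out = [sum(T_sym[i][j] for i in range(n)) for j in range(n)]
    return [
        [T_sym[i][j] / out[j] if out[j] > 0 else 0.0 for j in range(n)]
        for i in range(n)
    ]


p = [
    [0.0, 0.6, 0.5],
    [0.7, 0.0, 0.5],
    [0.3, 0.4, 0.0],
]
population = [100, 50, 10]
p_sym = symmetrize_import_probability(p, population)
print([[round(x, 3) for x in row] for row in p_sym])
```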
Model comparison
In the subsequent analysis, we evaluate the import probability estimates against the reference data through four approaches: (i) a direct comparison and assessment of their medians to identify potential systematic errors, (ii) the application of six distinct goodness-of-fit metrics to assess the individual model’s rank and relative performance, (iii) a classification task identifying countries with the highest import risk, particularly relevant in the context of a pandemic and (iv) a correlation study of the arrival time of 20 diseases and SARS-CoV-2 variants.
Qualitative comparison
In Fig 3 the import probability estimate p(i|n0) of each model is compared to the reference import probability. The gravity models exhibit the closest agreement with the reference data when the effective distance is employed, as indicated by the medians (Fig 3, first and second columns). In contrast, the median values of the radiation and import risk models are relatively stable and less influenced by variations in distance metrics or their associated weighting (third and fourth columns). All models overestimate the lowest median import probability (leftmost orange dot in Fig 3), since the estimated import probability is always nonzero, but a large proportion of the lowest reference import probabilities are zero due to the limited observation period and/or an insufficient number of departing passengers. The overestimation of the median import probability is observed up to p(i|n0) ≤ 10⁻⁴ for both the gravity and import risk models. However, this overestimation is notably absent in the case of the gravity model with an exponential distance decay function and the effective distance metric (Fig 3I), where the median demonstrates the closest alignment with the reference data. The radiation models (third column) systematically overestimate the highest import probabilities (p(i|n0) ⪆ 10⁻¹) and consequently underestimate the lower import probabilities.
Goodness of fit by multiple measures
We compared each model with the reference import probability via the Pearson correlation, the root-mean-square error (RMSE), and the common part of commuters. These measures are more sensitive to strong links, i.e. large import probabilities, which is important when the emphasis is placed on the countries that are most likely to import passengers. However, if the focus is to get a fair comparison including all links, logarithmic versions of the above measures or rank correlations are more appropriate. Thus, we also quantify the agreement by the correlation and the RMSE of the logarithm of the measures and by Kendall’s rank correlation. The three import risk model variations outperform the other models in all but one measure, whereby the variation employing the geodesic distance weighted exit probability performs best (Fig 4A). Following the import risk models, the two gravity models based on effective distance also exhibit strong rankings. In contrast, the remaining models lack consistent high rankings across all six measures and are more evenly distributed within the lower half. This model categorization also holds for the relative performance of the models (Fig 4B), with linear scaling of values in between (see Eq 22). In contrast to the rankings, the median relative performance shows a notable improvement when the gravity models incorporate effective distance. However, among the import risk models, the difference in median relative performance remains marginal.
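To make the comparison measures concrete, the following Python sketch implements simple versions of the Pearson correlation, the RMSE and the common part of commuters, together with a helper for the logarithmic variants. The precise definitions used in the study are given in Material and methods; the formulas here are the commonly used textbook forms, and dropping pairs with a zero entry before taking logarithms is an assumption of this sketch. All function names and example values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(x, y):
    """Root-mean-square error between estimate x and reference y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def common_part(x, y):
    """Common part of commuters: 2 * sum(min) / (sum(x) + sum(y))."""
    return 2.0 * sum(min(a, b) for a, b in zip(x, y)) / (sum(x) + sum(y))


def log_pairs(x, y):
    """Logarithms of all pairs in which both values are positive."""
    pairs = [(math.log(a), math.log(b)) for a, b in zip(x, y) if a > 0 and b > 0]
    return [a for a, _ in pairs], [b for _, b in pairs]


estimate = [0.5, 0.3, 0.15, 0.05]
reference = [0.6, 0.2, 0.2, 0.0]

cpc = common_part(estimate, reference)
log_est, log_ref = log_pairs(estimate, reference)
print(pearson(estimate, reference), rmse(estimate, reference), cpc)
print(pearson(log_est, log_ref), rmse(log_est, log_ref))
```

The first three measures are dominated by the largest import probabilities, whereas the logarithmic variants give every link with a nonzero reference value a comparable influence.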
The only measure where the import risk models are outperformed by the gravity models with effective distance is the logRMSE (Figs E, F in S1 Text). It is expected from the gravity models’ good agreement in median import probability with the reference data over wide ranges and the overestimation of low import probability by the import risk model. This overestimation can be reduced by model-modifications that introduce parameters favoring the exit at nodes with large-populations (for details, see Supplementary Note C and Figs G, H in S1 Text). However, we refrain from adding complexity to the model, since its generic nature is its key aspect.
Classification of ten top risk countries
In a pandemic context, it is of specific interest to identify the countries with the highest import probability. We analyzed how well the twelve proxy models can classify whether a country is among the ten countries with the highest import probability. Again, the import risk models outperform the other models, and the one with geodesic distance-weighted exit probabilities is the top predictor with a sensitivity of 71.1% (Fig 5D). All effective distance-based models have a high sensitivity (≳ 65%), including the radiation model with 66.8%, which had the lowest relative performance and second-lowest mean rank (Fig 5I–5K). For these high import probabilities, the import risk models now also outperform the other models in terms of RMSE and logRMSE, i.e. the 10 countries at highest risk are not only classified best by the import risk model, but also quantitatively assessed best.
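The classification task for a single source country can be sketched as follows. This is a simplified illustration under the assumption that sensitivity is the fraction of the true top-k countries that also appear in the estimated top-k set; how results are pooled over source countries in the study is not reproduced here. The function name, the country codes and the values are hypothetical, and k = 2 is used only to keep the example small.

```python
def top_k_sensitivity(estimate, reference, k=10):
    """Fraction of the reference top-k targets recovered by the estimate.

    estimate and reference map target country -> import probability
    for one fixed source country.
    """
    top_est = set(sorted(estimate, key=estimate.get, reverse=True)[:k])
    top_ref = set(sorted(reference, key=reference.get, reverse=True)[:k])
    return len(top_est & top_ref) / len(top_ref)


reference = {"US": 0.4, "FR": 0.3, "DE": 0.2, "IT": 0.1}
estimate = {"US": 0.5, "DE": 0.25, "FR": 0.2, "IT": 0.05}
sensitivity = top_k_sensitivity(estimate, reference, k=2)
print(sensitivity)
```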
Disease arrival time
In our final comparison, we evaluate the correlation between disease arrival times and the estimated import probability from the outbreak country of the disease. Note that the effective distance, which is the basis of the import risk model, already has a clear relation to disease arrival times. The import risk model was developed to extend this qualitative relation to a quantitative number of imported passengers, as done in a recent study on the pandemic potential of SARS-CoV-2 variants [11]. Nevertheless, a qualitative comparison to the arrival time is possible for each model via the negative logarithm of its import probability, which we refer to as the effective model distance dM and which relates linearly [16, 19] to the arrival time tA(i|j) of a disease
$$d_M(i|j) = -\ln p(i|j) \tag{7}$$
with j as the disease outbreak country. The arrival time tA(i|j) is the number of days between the disease outbreak and the day the first case is reported in the target country i. We evaluated the correlation C(tA, dM) for the H1N1 pandemic starting in 2009 [8], the COVID-19 pandemic starting in 2019 [57] and 18 of its variants. In addition to the import probability models, the correlations of the geodesic, geodesic path and effective distance with tA are included. Our analysis reveals that models employing the effective distance as the distance measure consistently outperform those relying on the geodesic or geodesic path distance (Fig 6A). Interestingly, the gravity model with a power-law decaying distance function consistently performs well, regardless of the specific distance measure employed. We do not observe a specific model that excels exclusively for certain diseases. Instead, we observe similar correlation values for the same disease across models (Fig 6B), which suggests that there is considerable noise on the arrival time tA that varies between diseases. The noise could be related to the disease-specific spreading speed: our assumption that the outbreak country is the sole source is increasingly violated the slower the disease spreads, because other countries become secondary sources. A simple linear regression of the mean correlation 〈C(tA, dM)〉 and the mean arrival time 〈tA〉 supports this hypothesis (r = −0.44, p = 0.055, Fig K in S1 Text).
Import risk of countries and regions
Having quantified the performance of the import risk model, we now focus on (i) country specific differences in its prediction quality, (ii) possible limitations due to no concept of administrative units (e.g. countries) whose airports are more interconnected and (iii) how the geodesic distance is encoded in the import risk model, i.e. how a distance dependence emerges from WAN information only.
Country specific performance
In the import risk approach, we assume minimal knowledge of the system, i.e. only the WAN is known. Consequently, we differentiate countries only via their network properties, one of which is the degree of a node, or more precisely the node strength, since the WAN is a weighted network. It is the simplest metric that is also easily adjustable for the country-level perspective. At the country level, the node strength corresponds directly to the flow out of country C
$$F_C = \sum_{j \in C} \sum_{i \notin C} F_{ij} \tag{8}$$
This country-specific characteristic signifies a country’s potential to influence the network’s structure, since flows from small-outflow countries are diluted by large-outflow countries. From an economic point of view, the outflow is strongly correlated with the gross domestic product of a country (Fig N in S1 Text). The correlation (logcorr) between the logarithms of the import risk p∞ and the reference import probability improves with the outflow of the source country (Fig 7), as illustrated by Great Britain (GB) as the country with the largest outflow in the WAN and Eritrea (ER) as one of the countries with the lowest outflow. The prediction improvement with the country’s outflow suggests that the WAN is dominated by large-outflow countries and therefore predictions worsen for countries with lower WAN outflow. However, the prediction improvement is also present in model alternatives that do not use WAN information at all (e.g. gravity with geodesic distance, Fig M in S1 Text). We rule out the explanation that the alternative models show this improvement due to preferential fitting of strong links (and therefore of large-outflow countries), since the models are fitted to the reference data by their import probabilities, which ensures equal weighting among countries. It rather suggests that the mobility behavior in low-outflow regions is different, which is also supported by the sudden performance saturation for countries with a WAN outflow of FC ≳ 10⁶ (Fig 7 and Fig M in S1 Text). Possibly, the passenger distribution of low-outflow countries is constrained by additional factors and is limited to the regions in proximity.
There are clear exceptions where the import risk estimation is worse compared to outbreak countries with a similar WAN outflow, such as Australia (AU), Israel (IL) and Macao (MO). These countries are connected due to historical relations to specific regions that are either not in their direct neighborhood (European countries for AU and IL) or that are more important than the bare neighborhood would suggest, as for Macao, which is a special administrative region of China. For Macao, the import risk to China is underestimated, which consequently leads to an overestimation of the import to other countries, and for AU and IL the import to Europe is underestimated, which leads to an overestimation for other regions (Fig 7). AU, IL and MO serve as examples illustrating that the WAN, the only information accessible to the import risk model, may not encapsulate all relevant information. Another concept that is missing in our methodological approach is the idea of a country or another administrative unit. Instead, the model treats airport pairs uniformly, disregarding their country affiliations. Since we know the international flights leaving a specific country from the WAN, we can run a self-consistency analysis, i.e. one without the need for reference import probability data. We can estimate the outflow leaving the country C by the import risk model by
$$\hat{F}_C = \sum_{n_0 \in C} N(n_0) \sum_{i \notin C} p_\infty(i|n_0) \tag{9}$$
If we compare it to FC, the WAN flow out of country C (see Eq 8), it turns out that the import risk model systematically overestimates the flow out of a country (Fig I panel A in S1 Text). In fact, the relative error increases with the number of airports belonging to the country (Fig I panel B in S1 Text). Possible explanations for this overestimation include the absence of a country-specific concept within the import risk model and the unintentional inclusion of transit passengers in the population count of airport catchment areas (since we use the outflow as a proxy for the population). However, we can easily correct for this overestimation in country-level analyses by normalizing the airport population such that the WAN country outflow is recovered.
Geodesic distance dependence
The import risk model estimates import probabilities without explicit geodesic-distance information (excluding the variant with distance-weighted exit probability). Since classical models have proven distance to be a good predictor for human mobility, we assume that it is encoded in the WAN structure and, as a consequence, in the import risk estimate [58]. To enhance clarity, we aggregate the import risk data across twenty-two world regions. We observe that the import risks to individual targets decrease in a manner resembling a power-law as the geodesic distance to the sources increases (Fig 8A and 8B and Fig L in S1 Text). When we change our perspective and examine the distance-dependence from a single source to all target regions (Fig 8D and 8E), the observed dependence is less consistent with a power-law fit with exponent α and coefficient c (Fig 8C). This is surprising, since the import risk is computed via a source-centric view (by computing the exit probability from the shortest path tree originating at each source), which suggests that the distance dependence should be clearest from one source to its possible targets. A possible explanation is that each target possesses its own attractiveness independent of the source region. This suggests that the distribution dynamics may resemble a pull mechanism rather than a push mechanism. Indeed, we find that the fitted exponent α from the power-law fit decreases as the WAN flow out of the target region increases, which can serve as a proxy for the attractiveness of a region (Fig 8F). In other words, the more attractive a region, the larger the import risks from more distant source regions. The fitted coefficient c has a high rank correlation with α (τKendall = 0.89), i.e. the coefficient also depends on the attractiveness of the region.
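As a rough illustration of such a fit, the sketch below estimates an exponent and a coefficient by least squares in log-log space. It assumes a power law of the form p = c · d^(−α); this functional form, the function name `fit_power_law`, and the synthetic numbers are assumptions made for illustration and are not taken from the study.

```python
import math

def fit_power_law(distances, risks):
    """Least-squares fit of log(p) = log(c) - alpha * log(d).

    Assumes the hypothetical form p = c * d**(-alpha); all inputs must be > 0.
    Returns the tuple (alpha, c).
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(p) for p in risks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log regression line
    intercept = mean_y - slope * mean_x    # equals log(c)
    return -slope, math.exp(intercept)

# Synthetic example: risks generated from c = 2.0 and alpha = 1.5.
distances_km = [1000.0, 2000.0, 4000.0, 8000.0]
risks = [2.0 * d ** -1.5 for d in distances_km]
alpha, c = fit_power_law(distances_km, risks)
```

Fitting in log space weights all distance scales equally, which matters when import risks span several orders of magnitude; a fit on the untransformed values would be dominated by the nearest regions.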
Discussion and conclusion
Motivated by the import probability’s strong dependence on the effective distance, we implemented the import risk model based on the effective distance shortest path tree’s exit probability in combination with a random walk on the WAN. As a result, we can infer the passenger trip distribution within the traffic network of their transport vehicle (WAN). When we compare our parameter-free model to variations of established mobility models, we observe that it surpasses the alternatives in most comparison measures. The only exception is a single measure for which the two parameter-fitted gravity models with effective distance perform best. The import risk model is the most accurate in determining countries with the highest import probability and is one of the models that correlate best with the time of arrival of 20 diseases, showcasing its importance for epidemic-related problems. However, it systematically overestimates low import probabilities and its performance worsens for countries with a passenger outflow below a million per year. Despite the lack of any explicit geodesic distance information, the import risk model recovers a geodesic distance dependence. This dependence is more pronounced when considering all sources to a single target than in the reverse scenario. We attribute this phenomenon to a target’s specific attractiveness, which we estimate using its node strength, i.e. the target’s passenger outflow.
The only measure where the gravity models with effective distance outperform the import risk models is the logRMSE. This is likely due to their good agreement over wide ranges of the import probability (Fig 3I and 3J). The import risk model performs poorly with respect to the logRMSE due to its systematic overestimation of low import probabilities. Note that the second parameter-free model, the radiation model, also deviates systematically at low import probabilities, although it underestimates them whereas the import risk model overestimates them. This is expected, since deviations from the assumptions cannot be corrected by any parameter adjustment. We identified several ways to reduce the import risk’s overestimation of low import probabilities by introducing an additional parameter that scales the population of the respective airport, changes the exit probability along the shortest path tree, or changes only the exit probability of specific nodes (for details, see Supplementary Note C and Figs G, H in S1 Text). In conclusion, we find that introducing modifications that enhance the probability of exiting at airports or nodes with large populations mitigates the issue of overestimation. However, we leave this as a possible extension of our model and highlight that it outperformed the other models in all correlation measures, illustrating its high potential.
The radiation model’s poor performance can likely be attributed to its initial design, which focused on small-scale commuter flows driven by work opportunities [43]. This shows that bottom-up approaches are often limited to their specific use case. They can, however, be adapted: the extended radiation model [59], for example, is no longer parameter-free and performs similarly to the gravity model [54]. Interestingly, the radiation model is the only one that does not improve with the inclusion of flight network information via the geodesic path or the effective distance (Fig 4). The radiation model’s insensitivity to network information can be attributed to the fact that it only extracts rank information from the distance data, resulting in a significant loss of information. The rank representation has the problem that airports that directly follow each other in their rank with respect to a source airport could be separated by a mountain range or an ocean, i.e. the rank difference is minimal but the actual distance is immense. This argument holds for any distance information.
We corrected the import probability by the symmetrization of the respective OD-matrices, which corresponds to a specific form of a doubly-constrained model. Normally, the constraints only ensure that the out- and inflow of each location correspond to the observations [31, 52, 54]; in contrast, we assume that both equal each other because of returning visitors. We repeated the model comparison without the correction: it reduced the agreement with the reference data for all but five of the seventy-two model-measure combinations (Fig F in S1 Text), which is in agreement with previous studies that report a better performance of doubly-constrained models [54]. Importantly, the import risk model still outperforms the other models if the import probability estimates are not corrected (compare Fig 4 with Fig J in S1 Text). It is crucial to note that the assumption of returning visitors is applicable when visitors and tourists dominate while migrants can be disregarded. However, this assumption may not hold for links between low- and high-income countries or conflict regions.
In the disease arrival time analysis, all models that use the effective distance perform similarly well, including all gravity models with power-law distance decay. The disease arrival time tA correlates with the logarithm of the estimated import probabilities, i.e. the results should be in agreement with the logcorr goodness-of-fit results. The models with effective distance vary by at most 0.07 in their logcorr measures, and these are based on 183 countries as potential sources (Fig J in S1 Text). However, the 20 diseases in the arrival time analysis have only 10 unique outbreak countries. Given this small sample, together with the uncertainty in arrival times caused by, for example, varying testing rates between countries, the data are likely insufficient to recover the logcorr results. In order to decrease the noise on tA, we repeated the analysis by extrapolating the arrival time via a logarithmic fit on the early cases, i.e. assuming an initial exponential growth (see Supplementary Note D in S1 Text). As a result of this procedure, some countries with insufficient data for extrapolation had to be excluded, which in turn led to the exclusion of more diseases. Nevertheless, the results are consistent with the tA estimation based on the first reported case (compare Fig 6 and Fig P in S1 Text).
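The extrapolation idea can be illustrated with a minimal sketch: under exponential growth, the logarithm of the case count is linear in time, so a straight-line fit can be extended backwards. The sketch assumes that the arrival time is defined as the day on which the fitted curve equals one case; this definition, the function name `extrapolate_arrival_time`, and the synthetic data are assumptions, and the procedure in Supplementary Note D may differ.

```python
import math

def extrapolate_arrival_time(days, cumulative_cases):
    """Fit log(cases) = a + b * day by least squares.

    Returns the day at which the fitted curve equals one case,
    i.e. where a + b * day = 0 (an assumed definition of arrival).
    """
    ys = [math.log(c) for c in cumulative_cases]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in days)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, ys))
    b = sty / stt            # fitted exponential growth rate per day
    a = mean_y - b * mean_t  # fitted log-count at day 0
    return -a / b

# Synthetic early-phase data: growth rate 0.2 per day, one case on day 5,
# but reporting only starts on day 10.
days = list(range(10, 21))
cases = [math.exp(0.2 * (t - 5)) for t in days]
t_arrival = extrapolate_arrival_time(days, cases)
```

In this toy example the first observation is on day 10, while the fit places the arrival on day 5, which shows how late detection can be partly compensated when growth is close to exponential.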
We found that, without providing any geodesic distance information to the import risk model, a distance dependence is recovered that is stronger for import probabilities to a single target than from a single source, even though the import probability is computed from a source-centric view. Since the WAN is spatially embedded and has a network dimension of three [58], its connections reflect to a certain degree the characteristics of the embedding space. This explains the import risk model’s ability to capture distance dependence in general. That distance is a better predictor in the target-centric view aligns well with a previous study in which a target-specific human-mobility model collapses mobility data to multiple targets by assigning each target a specific attractiveness that is proportional to the target’s population [51].
The import risk model predictions worsen for countries with a small outflow on the WAN, and since a country’s WAN outflow is proportional to its gross domestic product (GDP), the model performs less well for countries with a lower GDP, i.e. countries with a small population and/or low- to middle-income countries. This is unfortunate, as our model derives Origin-Destination (OD) information (costly to directly monitor) from cost-effective traffic flow monitoring, making it particularly valuable for regions with limited resources. However, we find that the model alternatives (gravity, radiation) also perform poorly for low-outflow countries and that the passenger distribution of the latter is most likely constrained by the GDP and thus limited to the target regions in effective proximity. To circumvent this problem, one could aggregate neighboring low-outflow countries until the conglomerate crosses the outflow threshold of FC = 10^6, above which we observe a performance saturation (Fig 7 and Fig M in S1 Text). Of course, this compromise comes with a lower spatial resolution, and we emphasize the need for future research in this direction.
While we have assessed the model’s performance on the world air transportation network, its applicability extends to other modes of transportation such as subway systems, cars, buses, and trains. Future research will explore the specific conditions under which this model can be effectively applied. Furthermore, there is room for improvement in the basic estimation of the traveling population within an airport’s catchment area based solely on its outflow. This estimation does not currently account for the significant role of hubs and the missing information about transit passengers. The simple framework that only relies on the traffic network is appealing, but in certain scenarios its prediction can be refined by using information about the GDP, Gini-coefficient or population density.
Our comparison focused on the parameter-free radiation model and the fitted gravity model, but we acknowledge the existence of promising variations and alternative models that were not included in this study [30, 31, 54, 59]. However, the gravity model is widely applied and has been shown to perform equally well [59] or better than alternatives [54]. There are exceptions, e.g. an iterative computation of a gravity-like model outperforms the common gravity model in cases where the complete mobility network is not available [29]. Additionally, the radiation model outperforms the gravity model for long-distance connections [38, 54]. Still, the simplicity of the gravity model and its adaptability by parameter adjustment make it a strong counterpart. The model alternatives make use of the WAN-structure information by using the effective distance, as done e.g. in Ren et al. [60], where the radiation model with time-distance predicted the traffic on each link better than with the travel-distance on the road network. Similarly, we observed that the effective distance, which is related to the arrival time of diseases, outperforms the geodesic path-distance in predicting import probabilities.
The import risk model is fundamentally different from classic approaches that estimate OD trips from traffic data, because the latter find the OD trips that best reproduce the traffic data [28, 30, 44, 45], while our model runs a distribution process on the traffic data network. Thus, our model is a mechanistic bottom-up approach, while the classic approaches either fit and require the knowledge of the reference trip data [28, 30] or are based on the assumption that the trip distribution across the links follows the maximum entropy principle, i.e. the OD trips that can be realized by the largest number of microstates are considered most likely [44, 45]. Note that maximum entropy approaches require an estimation of routes and their alternatives between each OD pair, while we allow all routes to be taken by the random walker. To the best of our knowledge, our model is unique in its mechanistic nature, enabling the study of modifications to its underlying distribution process. This includes, for instance, containment strategies aimed at slowing or restricting a pandemic. A straightforward implementation could be the testing of a fraction of passengers Ci ≤ 1 at every transit airport i, which corresponds to reducing the probability to continue walking of an infected passenger (Eq 2) to
With C = [C1, C2, …] one could allow for a varying testing capacity between the airports.
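To make the testing idea concrete, the following sketch runs a generic absorbing random walk in which an infected passenger at airport i ends the trip with an exit probability and otherwise continues, and in which testing multiplies the continuation probability by (1 − Ci). It is not the import risk model itself: Eq 2 is not reproduced here, and the transition matrix, the exit probabilities, the function name `import_probabilities`, and the toy network are assumptions for illustration only.

```python
def import_probabilities(transition, exit_prob, source, test_fraction=None,
                         tol=1e-12, max_steps=10_000):
    """Probability that an infected walker starting at `source` exits at each airport.

    transition[i][j]: probability of flying from i to j, given that the walker continues.
    exit_prob[i]:     probability of ending the trip at airport i.
    test_fraction[i]: assumed fraction C_i of continuing passengers tested and removed at i.
    """
    n = len(exit_prob)
    if test_fraction is None:
        test_fraction = [0.0] * n
    mass = [0.0] * n       # probability that the walker is currently at each airport
    mass[source] = 1.0
    arrived = [0.0] * n    # accumulated probability of having exited at each airport
    for _ in range(max_steps):
        if sum(mass) < tol:
            break
        next_mass = [0.0] * n
        for i, m in enumerate(mass):
            if m == 0.0:
                continue
            arrived[i] += m * exit_prob[i]
            # Continuation probability reduced by the tested fraction C_i.
            cont = m * (1.0 - exit_prob[i]) * (1.0 - test_fraction[i])
            for j, p_ij in enumerate(transition[i]):
                next_mass[j] += cont * p_ij
        mass = next_mass
    return arrived

# Toy chain 0 -> 1 -> 2: airport 1 is a transit hub, airport 2 is a final destination.
transition = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
exit_prob = [0.0, 0.5, 1.0]
baseline = import_probabilities(transition, exit_prob, source=0)
with_tests = import_probabilities(transition, exit_prob, source=0,
                                  test_fraction=[0.0, 0.4, 0.0])
```

In this sketch the test factor applies at every airport where the walker continues, so restricting testing to transit airports means setting the entry for the source to zero, as done in the example. The probability mass missing from the result corresponds to infected passengers who were detected.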
Material and methods
Data sources
The WAN provided by OAG (Official Airline Guide) [46] contains the number of flights and the respective maximum seat capacity Fi,j between airports i and j aggregated for the year 2014. The reference import probability is based on the “Global Transnational Mobility Dataset” [40, 47] that assigns the number of trips in 2014 from country n to m worldwide by combining the world air transportation origin-final-destination data set from the company SABRE and cross-border visits with an overnight stay from the UNWTO (World Tourism Organization). Thus, the reference data represent not only the mobility via air travel but also via other means (sea, road, rail). However, air travel dominates long-distance trips, which makes the data set a fair reference for the air transportation origin-final-destination matrix. For details on how the data sets were combined, see Supplementary Note A in S1 Text.
Alternative models
The gravity model states that the number of trips between regions n and m increases with their population sizes (Nn and Nm) and decreases with their distance dnm:
(10) |
with f(d) as a function that grows monotonically with distance d, most often chosen as either a power-law f(d) = d^γ or an exponential f(d) = e^(γd).
In the radiation model, the trips from n to m depend on their respective population sizes Nn, Nm (or other measures such as job opportunities) and on the number of people smn that are in a circle with radius rmn centered around location n, including Nn and Nm:
(11) |
The import probability of both models is computed by normalizing the trips with respect to the source-region
(12) |
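The gravity model and the source normalization can be sketched as follows. The sketch assumes the simplest form, in which the trips equal Nn · Nm / f(dnm) with the power-law f(d) = d^γ and no additional exponents on the populations; this form, the function names, and the toy numbers are assumptions and may differ from Eqs 10 and 12.

```python
def gravity_trips(populations, distances, gamma):
    """Trips[n][m] = N_n * N_m / d_nm**gamma for n != m (assumed form), else 0."""
    size = len(populations)
    return [
        [
            0.0 if n == m
            else populations[n] * populations[m] / distances[n][m] ** gamma
            for m in range(size)
        ]
        for n in range(size)
    ]

def import_probability(trips):
    """Normalize each source row so that it sums to 1: p[n][m] is the import probability to m from n."""
    return [[t / sum(row) for t in row] for row in trips]

# Toy example with three regions on a line (distances in arbitrary units).
populations = [4.0, 2.0, 1.0]
distances = [[0.0, 1.0, 2.0],
             [1.0, 0.0, 1.0],
             [2.0, 1.0, 0.0]]
trips = gravity_trips(populations, distances, gamma=2.0)
p = import_probability(trips)
```

Because of the normalization by source, the source population cancels in the import probability; only the target populations and the distance function determine how the trips of a source are split among the targets.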
Trip-symmetrization
We correct the import probability by symmetrizing the OD-matrix in three steps: (i) compute the estimated OD-matrix
(13) |
from the import probability estimate, (ii) correct it by computing its symmetric part
(14) |
and (iii) compute the corresponding corrected import probability via
(15) |
By going through these steps, the asymmetry is reduced heavily but still persists. Thus, we repeat steps (i) to (iii) until we obtain p^(3)(A|B), which yields for all models an asymmetry that is comparable in mean and median to that of the reference data (see Supplementary Note B in S1 Text for details).
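The three steps can be sketched as below. Since Eqs 13 to 15 are not reproduced here, the sketch assumes that the OD-matrix is the import probability multiplied by the outflow of the source, that the symmetric part is the average of the matrix and its transpose, and that the corrected probability is obtained by normalizing each source row. The function names, the default of three iterations, the asymmetry measure, and the toy numbers are assumptions for illustration.

```python
def symmetrize_import_probability(p, outflow, iterations=3):
    """Repeat steps (i) to (iii); p[i][j] is the import probability to j from source i."""
    n = len(p)
    for _ in range(iterations):
        # (i) estimated OD-matrix from the current import probability
        trips = [[p[i][j] * outflow[i] for j in range(n)] for i in range(n)]
        # (ii) symmetric part of the OD-matrix
        sym = [[0.5 * (trips[i][j] + trips[j][i]) for j in range(n)] for i in range(n)]
        # (iii) corrected import probability, normalized per source
        p = [[s / sum(row) for s in row] for row in sym]
    return p

def asymmetry(p, outflow):
    """Relative asymmetry of the OD-matrix implied by p (a hypothetical measure)."""
    n = len(p)
    trips = [[p[i][j] * outflow[i] for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in trips)
    return sum(abs(trips[i][j] - trips[j][i]) for i in range(n) for j in range(n)) / total

# Toy example with three regions and very different outflows.
p0 = [[0.0, 0.9, 0.1],
      [0.5, 0.0, 0.5],
      [0.2, 0.8, 0.0]]
outflow = [100.0, 50.0, 10.0]
p3 = symmetrize_import_probability(p0, outflow)
before, after = asymmetry(p0, outflow), asymmetry(p3, outflow)
```

Because the outflow of each source is kept fixed, the implied OD-matrix cannot become perfectly symmetric when the outflows differ, which is consistent with the remark that the asymmetry is reduced but persists.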
Comparison measures
We compare the import probability models with the reference data via the Pearson correlation
(16) |
with as average, the root-mean-square error
(17) |
the common part of commuters [59]
(18) |
which is 1 if all links are identical and 0 if none of them agrees. All the above measures are more sensitive to strong links, i.e. large import probabilities. However, if the focus is to get a fair comparison including all links, we are more interested in logarithmic versions of the above measures or rank correlations. Thus, we compare the logarithm of the import probabilities via correlation
(19) |
root-mean-square error
(20) |
and use the Kendall rank correlation coefficient
$$\tau_{\mathrm{Kendall}} = \frac{C - D}{\sqrt{(C + D + T_x)(C + D + T_y)}} \qquad (21)$$
with C and D as the number of concordant and discordant pairs and Tx and Ty as ties only in x and y, respectively.
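The six measures can be sketched with the Python standard library as follows. The definitions are the textbook ones (Pearson correlation, root-mean-square error, common part of commuters as twice the summed minimum divided by the summed totals, and the Kendall coefficient with ties counted only in x or only in y); the exact normalization used in Eqs 16 to 21 may differ, and the function names and the toy vectors are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse(x, y):
    """Root-mean-square error between estimate x and reference y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def common_part_of_commuters(x, y):
    """1 if all links are identical, 0 if none of them overlaps."""
    return 2.0 * sum(min(a, b) for a, b in zip(x, y)) / (sum(x) + sum(y))

def kendall_tau_b(x, y):
    """Kendall rank correlation with concordant/discordant pairs and one-sided ties."""
    conc = disc = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue          # tied in both: counted in neither T_x nor T_y
            elif dx == 0:
                ties_x += 1       # tie only in x
            elif dy == 0:
                ties_y += 1       # tie only in y
            elif dx * dy > 0:
                conc += 1
            else:
                disc += 1
    return (conc - disc) / math.sqrt((conc + disc + ties_x) * (conc + disc + ties_y))

def log_values(x):
    """Natural logarithm of each entry; requires strictly positive values."""
    return [math.log(v) for v in x]

# Toy import probabilities for three targets.
estimate = [0.5, 0.3, 0.2]
reference = [0.4, 0.4, 0.2]
measures = {
    "corr": pearson(estimate, reference),
    "rmse": rmse(estimate, reference),
    "cpc": common_part_of_commuters(estimate, reference),
    "logcorr": pearson(log_values(estimate), log_values(reference)),
    "logrmse": rmse(log_values(estimate), log_values(reference)),
    "kendall": kendall_tau_b(estimate, reference),
}
```

The logarithmic variants reuse the same functions on log-transformed inputs, which is why links with zero probability would have to be excluded or handled separately before computing them.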
To simplify and generalize the comparison we combine the six above defined measures by computing the mean rank of each model, i.e. the best correlating model has the highest (12) and the worst the lowest (0) rank and the mean rank of one model is the average of all six ranks.
To quantify the mean difference between the models we define the relative performance of one model M as
(22) |
with f(xM) = f(xM, y) as the specific comparison function and best(f(xk), k) and worst(f(xk), k) as the best and worst performing value of all models using this comparison function. Note that best(…) = max(…) except for the RMSE measures, where it is min(…) (and analogously for worst(…)).
Disease arrival times
The disease arrival time tA(i) in country i is estimated by the date of the first reported case for H1N1 and SARS-CoV-2. For the SARS-CoV-2 variants, we use the first sequenced sample in this country. However, for certain variants some sequenced samples appear in the statistics months before the outbreak date declared by the WHO [61]. We treat these as misclassifications, discard them, and use instead the first sample after the WHO-listed outbreak for the respective country (see Supplementary Note D and Fig O in S1 Text for details). For each of the diseases/variants, we used the WAN that we have access to and that is closest to the respective outbreak date (see Table B in S1 Text), and as outbreak country we used the one listed by the WHO as the first country with a sequenced sample of the respective variant [61]. For the H1N1 outbreak in 2009, we used the case data provided by FluNet [62, 63] (the column AH1N12009); for the COVID-19 cases, we used the WHO COVID-19 dashboard [64] accessed through ourworldindata.org; the number of sequenced samples was accessed through GISAID [65–67] using the file gisaid_variants_statistics.json.
Supporting information
Acknowledgments
We acknowledge Marc Wiedermann for insightful comments.
Data Availability
The software “ImportRisk-v1.0.0” to compute the import risk is available under the Zenodo repository https://doi.org/10.5281/zenodo.7852476.
Funding Statement
B.F.M received funding through Grant CF20-0044, HOPE: How Democracies Cope with Covid-19, from the Carlsberg Foundation and was supported as an Add-On Fellow for Interdisciplinary Life Science by the Joachim Herz Stiftung. P.P.K, A.Z, F.S received funding through Grant D81870, COVID-19 Lockdown-Monitor, from Germany’s Federal Ministry of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Carlier M. Number of passenger cars and commercial vehicles in use worldwide from 2006 to 2015; 2021. Available from: https://www.statista.com/statistics/281134/number-of-vehicles-in-use-worldwide/.
- 2. OECD. Container transport (indicator); 2023. Available from: https://data.oecd.org/transport/container-transport.htm.
- 3. Statista Research Department. Global air traffic—scheduled passengers 2004-2022; 2023. Available from: https://www.statista.com/statistics/564717/airline-industry-passenger-traffic-globally/.
- 4. Chapman D, Purse BV, Roy HE, Bullock JM. Global trade networks determine the distribution of invasive non-native species. Global Ecology and Biogeography. 2017;26(8):907–917. doi: 10.1111/geb.12599
- 5. Yashadhana A, Derbas A, Biles J, Grant J. Pandemic-related racial discrimination and its health impact among non-Indigenous racially minoritized peoples in high-income contexts: a systematic review. Health Promotion International. 2021;37(2).
- 6. Hays JN. Epidemics and pandemics: their impacts on human history. ABC-CLIO; 2005.
- 7. Daftary A, Frick M, Venkatesan N, Pai M. Fighting TB stigma: we need to apply lessons learnt from HIV activism. BMJ Global Health. 2017;2(4):e000515. doi: 10.1136/bmjgh-2017-000515
- 8. Fineberg HV. Pandemic Preparedness and Response — Lessons from the H1N1 Influenza of 2009. New England Journal of Medicine. 2014;370(14):1335–1342. doi: 10.1056/NEJMra1208802
- 9. Fraser C, Donnelly CA, Cauchemez S, Hanage WP, Van Kerkhove MD, Hollingsworth TD, et al. Pandemic Potential of a Strain of Influenza A (H1N1): Early Findings. Science. 2009;324(5934):1557–1561. doi: 10.1126/science.1176062
- 10. Jia JS, Lu X, Yuan Y, Xu G, Jia J, Christakis NA. Population flow drives spatio-temporal distribution of COVID-19 in China. Nature. 2020;582(7812):389–394. doi: 10.1038/s41586-020-2284-y
- 11. Klamser PP, D’Andrea V, Di Lauro F, Zachariae A, Bontorin S, Di Nardo A, et al. Enhancing global preparedness during an ongoing pandemic from partial and noisy data. PNAS Nexus. 2023;2(6). doi: 10.1093/pnasnexus/pgad192
- 12. Hadfield J, Megill C, Bell SM, Huddleston J, Potter B, Callender C, et al. Nextstrain: real-time tracking of pathogen evolution. Bioinformatics. 2018;34(23):4121–4123. doi: 10.1093/bioinformatics/bty407
- 13. Tegally H, Wilkinson E, Martin D, Moir M, Brito A, Giovanetti M, et al. Global Expansion of SARS-CoV-2 Variants of Concern: Dispersal Patterns and Influence of Air Travel. medRxiv. 2022. doi: 10.1101/2022.11.22.22282629
- 14. Sacco PL, Arenas A, De Domenico M. The Resilience of the Multirelational Structure of Geopolitical Treaties is Critically Linked to Past Colonial World Order and Offshore Fiscal Havens. Complexity. 2023;2023:1–9. doi: 10.1155/2023/5280604
- 15. Kissinger H. World Order. Penguin Books; 2015.
- 16. Brockmann D, Helbing D. The Hidden Geometry of Complex, Network-Driven Contagion Phenomena. Science. 2013;342(6164):1337–1342. doi: 10.1126/science.1245200
- 17. Iannelli F, Koher A, Brockmann D, Hövel P, Sokolov IM. Effective distances for epidemics spreading on complex networks. Physical Review E. 2017;95(1):012313. doi: 10.1103/PhysRevE.95.012313
- 18. Gautreau A, Barrat A, Barthélemy M. Arrival time statistics in global disease spread. Journal of Statistical Mechanics: Theory and Experiment. 2007;2007(09):L09001–L09001.
- 19. Gautreau A, Barrat A, Barthélemy M. Global disease spread: Statistics and estimation of arrival times. Journal of Theoretical Biology. 2008;251(3):509–522. doi: 10.1016/j.jtbi.2007.12.001
- 20. Nohara Y, Manabe T. Impact of human mobility and networking on spread of COVID-19 at the time of the 1st and 2nd epidemic waves in Japan: An effective distance approach. PLOS ONE. 2022;17(8):e0272996. doi: 10.1371/journal.pone.0272996
- 21. Nah K, Otsuki S, Chowell G, Nishiura H. Predicting the international spread of Middle East respiratory syndrome (MERS). BMC Infectious Diseases. 2016;16(1):356. doi: 10.1186/s12879-016-1675-z
- 22. Otsuki S, Nishiura H. Reduced Risk of Importing Ebola Virus Disease because of Travel Restrictions in 2014: A Retrospective Epidemiological Modeling Study. PLOS ONE. 2016;11(9):e0163418. doi: 10.1371/journal.pone.0163418
- 23. Nah K, Mizumoto K, Miyamatsu Y, Yasuda Y, Kinoshita R, Nishiura H. Estimating risks of importation and local transmission of Zika virus infection. PeerJ. 2016;4:e1904. doi: 10.7717/peerj.1904
- 24. Edsberg Møllgaard P, Lehmann S, Alessandretti L. Understanding components of mobility during the COVID-19 pandemic. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 2022;380(2214).
- 25. Coelho FC, Lana RM, Cruz OG, Villela DAM, Bastos LS, Pastore y Piontti A, et al. Assessing the spread of COVID-19 in Brazil: Mobility, morbidity and social vulnerability. PLOS ONE. 2020;15(9):e0238214. doi: 10.1371/journal.pone.0238214
- 26. Adiga A, Venkatramanan S, Schlitt J, Peddireddy A, Dickerman A, Bura A, et al. Evaluating the impact of international airline suspensions on the early global spread of COVID-19. medRxiv. 2020. doi: 10.1101/2020.02.20.20025882
- 27. Zipf GK. The P 1 P 2 D Hypothesis: On the Intercity Movement of Persons. American Sociological Review. 1946;11(6):677. doi: 10.2307/2087063
- 28. Cascetta E, Nguyen S. A unified framework for estimating or updating origin/destination matrices from traffic counts. Transportation Research Part B: Methodological. 1988;22(6):437–455. doi: 10.1016/0191-2615(88)90024-0
- 29. Lenormand M, Huet S, Gargiulo F, Deffuant G. A Universal Model of Commuting Networks. PLoS ONE. 2012;7(10):e45985. doi: 10.1371/journal.pone.0045985
- 30. Abrahamsson T. Estimation of Origin-Destination Matrices Using Traffic Counts—A Literature Survey. Laxenburg, Austria: IIASA; 1998. Available from: https://pure.iiasa.ac.at/id/eprint/5627/.
- 31. Barbosa H, Barthelemy M, Ghoshal G, James CR, Lenormand M, Louail T, et al. Human mobility: Models and applications. Physics Reports. 2018;734:1–74. doi: 10.1016/j.physrep.2018.01.001
- 32. Balcan D, Colizza V, Gonçalves B, Hu H, Ramasco JJ, Vespignani A. Multiscale mobility networks and the spatial spreading of infectious diseases. Proceedings of the National Academy of Sciences. 2009;106(51):21484–21489. doi: 10.1073/pnas.0906910106
- 33. Balcan D, Gonçalves B, Hu H, Ramasco JJ, Colizza V, Vespignani A. Modeling the spatial spread of infectious diseases: The GLobal Epidemic and Mobility computational model. Journal of Computational Science. 2010;1(3):132–145. doi: 10.1016/j.jocs.2010.07.002
- 34. Tizzoni M, Bajardi P, Poletto C, Ramasco JJ, Balcan D, Gonçalves B, et al. Real-time numerical forecast of global epidemic spreading: case study of 2009 A/H1N1pdm. BMC Medicine. 2012;10(1):165. doi: 10.1186/1741-7015-10-165
- 35. Poletto C, Gomes MF, Pastore y Piontti A, Rossi L, Bioglio L, Chao DL, et al. Assessing the impact of travel restrictions on international spread of the 2014 West African Ebola epidemic. Eurosurveillance. 2014;19(42). doi: 10.2807/1560-7917.es2014.19.42.20936
- 36. Poletto C, Pelat C, Lévy-Bruhl D, Yazdanpanah Y, Boëlle PY, Colizza V. Assessment of the Middle East respiratory syndrome coronavirus (MERS-CoV) epidemic in the Middle East and risk of international spread using a novel maximum likelihood analysis approach. Eurosurveillance. 2014;19(23). doi: 10.2807/1560-7917.ES2014.19.23.20824
- 37. Gómez-Gardeñes J, Soriano-Paños D, Arenas A. Critical regimes driven by recurrent mobility patterns of reaction–diffusion processes in networks. Nature Physics. 2018;14(4):391–395. doi: 10.1038/s41567-017-0022-7
- 38. Masucci AP, Serras J, Johansson A, Batty M. Gravity versus radiation models: On the importance of scale and heterogeneity in commuting flows. Physical Review E. 2013;88(2):022812. doi: 10.1103/PhysRevE.88.022812
- 39. O’Neill S. Lufthansa Now Drives More Than Half Its Bookings Directly; 2019. Available from: https://skift.com/2019/03/14/lufthansa-now-drives-more-than-half-its-bookings-directly/.
- 40. Recchi E, Deutschmann E, Vespe M. Estimating Transnational Human Mobility on a Global Scale. SSRN Electronic Journal. 2019. doi: 10.2139/ssrn.3384000
- 41. Christidis P, Christodoulou A. The Predictive Capacity of Air Travel Patterns during the Global Spread of the COVID-19 Pandemic: Risk, Uncertainty and Randomness. International Journal of Environmental Research and Public Health. 2020;17(10):3356. doi: 10.3390/ijerph17103356
- 42. Stouffer SA. Intervening Opportunities: A Theory Relating Mobility and Distance. American Sociological Review. 1940;5(6):845. doi: 10.2307/2084520
- 43. Simini F, González MC, Maritan A, Barabási AL. A universal model for mobility and migration patterns. Nature. 2012;484(7392):96–100. doi: 10.1038/nature10856
- 44. de Grange L, González F, Bekhor S. Path Flow and Trip Matrix Estimation Using Link Flow Density. Networks and Spatial Economics. 2017;17(1):173–195. doi: 10.1007/s11067-016-9322-1
- 45. Englezou Y, Timotheou S, Panayiotou CG. Estimating the Origin-Destination Matrix using link count observations from Unmanned Aerial Vehicles. In: 2021 IEEE International Intelligent Transportation Systems Conference (ITSC). IEEE; 2021. p. 3539–3544. Available from: https://ieeexplore.ieee.org/document/9564959/.
- 46. Official Airline Guide. OAG Global Airline Schedule Data; 2014. Available from: https://www.oag.com/airline-schedules-data.
- 47. Recchi E, Deutschmann E, Vespe M. Global Transnational Mobility Dataset; 2019. doi: 10.5281/zenodo.3911054
- 48. Jordahl K, den Bossche JV, Fleischmann M, Wasserman J, McBride J, Gerard J, et al. geopandas/geopandas: v0.8.1; 2020. doi: 10.5281/zenodo.3946761
- 49. Song C, Koren T, Wang P, Barabási AL. Modelling the scaling properties of human mobility. Nature Physics. 2010;6(10):818–823. doi: 10.1038/nphys1760
- 50. Brockmann D, Hufnagel L, Geisel T. The scaling laws of human travel. Nature. 2006;439(7075):462–465. doi: 10.1038/nature04292
- 51. Schläpfer M, Dong L, O’Keeffe K, Santi P, Szell M, Salat H, et al. The universal visitation law of human mobility. Nature. 2021;593(7860):522–527. doi: 10.1038/s41586-021-03480-9
- 52. Noulas A, Scellato S, Lambiotte R, Pontil M, Mascolo C. A Tale of Many Cities: Universal Patterns in Human Urban Mobility. PLoS ONE. 2012;7(5):e37027. doi: 10.1371/journal.pone.0037027
- 53. Liang X, Zhao J, Dong L, Xu K. Unraveling the origin of exponential law in intra-urban human mobility. Scientific Reports. 2013;3(1):2983. doi: 10.1038/srep02983
- 54. Lenormand M, Bassolas A, Ramasco JJ. Systematic comparison of trip distribution laws and models. Journal of Transport Geography. 2016;51:158–169. doi: 10.1016/j.jtrangeo.2015.12.008
- 55. Pastore y Piontti A, Gomes MFDC, Samay N, Perra N, Vespignani A. The infection tree of global epidemics. Network Science. 2014;2(1):132–137. doi: 10.1017/nws.2014.5
- 56. Belik V, Geisel T, Brockmann D. Natural Human Mobility Patterns and Spatial Spread of Infectious Diseases. Physical Review X. 2011;1(1):011001. doi: 10.1103/PhysRevX.1.011001
- 57. Ciotti M, Ciccozzi M, Terrinoni A, Jiang WC, Wang CB, Bernardini S. The COVID-19 pandemic. Critical Reviews in Clinical Laboratory Sciences. 2020;57(6):365–388. doi: 10.1080/10408363.2020.1783198
- 58. Daqing L, Kosmidis K, Bunde A, Havlin S. Dimension of spatially embedded networks. Nature Physics. 2011;7(6):481–484. doi: 10.1038/nphys1932
- 59. Yang Y, Herrera C, Eagle N, González MC. Limits of predictability in commuting flows in the absence of data for calibration. Scientific Reports. 2014;4(1):5662. doi: 10.1038/srep05662
- 60. Ren Y, Ercsey-Ravasz M, Wang P, González MC, Toroczkai Z. Predicting commuter flows in spatial networks using a radiation model based on temporal ranges. Nature Communications. 2014;5(1):5347. doi: 10.1038/ncomms6347
- 61. Technical Advisory Group on SARS-CoV-2 Virus Evolution. Historical working definitions and primary actions for SARS-CoV-2 variants; 2023. Available from: https://www.who.int/publications/m/item/historical-working-definitions-and-primary-actions-for-sars-cov-2-variants.
- 62. Flahault A, Dias-Ferrao V, Chaberty P, Esteves K, Valleron AJ, Lavanchy D. FluNet as a tool for global monitoring of influenza on the Web. Jama. 1998;280(15):1330–1332. doi: 10.1001/jama.280.15.1330
- 63. Geneva: World Health Organization. WHO: Global Influenza Programme—FluNet; 1997. Available from: https://www.who.int/tools/flunet.
- 64. Geneva: World Health Organization. WHO COVID-19 Dashboard; 2020. Available from: https://covid19.who.int/.
- 65. Elbe S, Buckland-Merrett G. Data, disease and diplomacy: GISAID’s innovative contribution to global health. Global challenges. 2017;1(1):33–46. doi: 10.1002/gch2.1018
- 66. Shu Y, McCauley J. GISAID: Global initiative on sharing all influenza data–from vision to reality. Eurosurveillance. 2017;22(13):30494. doi: 10.2807/1560-7917.ES.2017.22.13.30494
- 67. Khare S, Gurry C, Freitas L, Schultz MB, Bach G, Diallo A, et al. GISAID’s role in pandemic response. China CDC Weekly. 2021;3(49):1049. doi: 10.46234/ccdcw2021.255
- 68.Maier BF, Klamser PP, Zachariae A, Schlosser F, Brockmann D. ImportRisk-v1.0.0; 2023. Available from: 10.5281/zenodo.7852477. [DOI]
- 69.World Bank. The World Bank: GDP per capita; 2023. Available from: https://data.worldbank.org/indicator/NY.GDP.PCAP.CD.