Abstract
Twitter has been actively researched as a human mobility proxy. Tweets can contain two classes of geographical metadata: the location from which a tweet was published, and the place where the tweet is estimated to have been published. Nevertheless, Twitter also returns tweets without any geographical metadata when queried for tweets in a specific location. This study presents a methodology that includes an algorithm for estimating the geographical coordinates of tweets to which Twitter does not assign any. Our objective is to determine the origin and the route that a tourist followed, even if Twitter does not return geographically identified data. This is carried out through geographical searches of tweets inside a defined area. When a tweet is found inside an area but its metadata contains no explicit geographical coordinates, its coordinates are estimated by iteratively performing geographical searches with a decreasing search radius. The algorithm was tested in two touristic villages of Madrid (Spain) and a major city in Canada. A set of tweets without geographical coordinates was found in these areas and processed. The coordinates of a subset of them were successfully estimated.
Keywords: Tourism, Twitter, Geotagged, Route, Origin, Estimating
Highlights
- A methodology to determine the origin and route followed by tourists is proposed.
- It is based on a quadrant circle algorithm (QCA) and Twitter location data.
- The case studies were two small villages in Spain and a large city in Canada.
- Up to 36.36% of non-geotagged tweets can be traced in the analyzed areas.
- This method is a contribution to the study of the dynamics of mobility and tourism.
1. Introduction
Tourism is a source of wealth and sustained growth. The relation between tourism and economic growth has been explained in several studies [1], [2]. Until 2019, tourism was the main motivation for international travel, accounting for 56% of trips, followed by visiting friends and relatives, health, religious and other purposes (27%), and business travel (13%). Until that time, tourism was the world's third largest export after chemicals and fuels [3]. International tourism expenditure experienced an average annual growth of 6.5% from 2010 to 2018 [4]. Even in 2019 this sector grew 3.5%, contributing 10.3% to the global GDP (Gross Domestic Product) and 28.3% to global services exports [5].
Nevertheless, the COVID-19 pandemic overturned this trend. Tourism has been dramatically affected by the mobility restrictions that governments implemented to curb the spread of the virus. In 2020, this meant a worldwide loss of 100.9 million jobs in this industry, a decrease of 30% in its contribution to GDP [5], a reduction of up to 60% in air travel passengers (international and domestic) and a decrease in international tourism income of between 910 and 1,170 million dollars in the USA in 2020, compared to the 1.5 billion dollars generated in 2019 in the same area [6].
The situation in the tourism sector remains volatile. Thus, predicting trends in consumption and, in particular, in tourism in the post-COVID society has become a challenge for researchers [7], [8]. Tourism has traditionally been a significant form of human mobility, and the analysis of current tourist mobility patterns might play an important role in strategic tourism planning. Mobility includes the mobility of humans, objects, ideas and information, since the movement of one of these elements can affect the others [9], [10], [11], and it has been a core issue in diverse fields such as geography, sociology, anthropology and economics. The study of mobility is crucial in many areas, for instance transport planning [12], traffic prediction [13], [14], urban planning [15], control of disease spread [16], [17], the management of disasters and emergencies [16], and tourism and hospitality [18], [19], [20], [21], [22]. Specifically, tourist mobility is related to both work and leisure travel. This mobility is usually affected by geopolitical issues, wars, terrorism, security threats and health emergencies. These elements contribute to the interest in studying tourism mobilities and their effects [19], [23].
Social networks such as Twitter constitute a valuable data source for studying tourism and the aforementioned mobility [24], [25], [26], [9], [27]. Twitter is one of the most popular, with 340 million users and 500 million messages published per day [25]. In particular, Twitter was selected among other possible data sources because of its free and heterogeneous nature (it captures people's movements regardless of transportation mode), its ease of access to data, and because it is less intrusive than options such as those based on mobile GPS data [28], [29], [30]. Tweet geotagging has been used in the study of human mobility at international scale [31], [32], at national scale [33], [34], [35] and over long periods of time [36], [37]. Twitter data have also proven useful in the analysis of local mobility patterns in specific situations: college football events [38], migration from Puerto Rico due to a hurricane [39], and the communication ecosystem for foreigners during a typhoon [40]. Twitter has been used in analyzing tourism-related mobility as well, for example to analyze the home country of tourists in Nepal [41], to check the official visitor counts in national parks [42], and to analyze the relation between the visitation rate of New York City's parks and their characteristics [43], [44]. Generally, the tweet location used in the analyses cited above was the one given by Twitter, either an accurate location or an approximate area where the tweet could be found. Because of this limitation, a large number of tweets were not processed.
The goal of this work is to determine the origin and trace the route of travelers in a specific area. To that end, an algorithm that exploits the heuristics used by Twitter to locate tweets is used to increase the amount of data available for the tracing task. The heuristics exploited include using a user's profile location to assign a location to a tweet that has no geographical information [45]. Thus, it becomes possible to accurately detect the trajectory followed by Twitter users using non-intrusive and voluntarily published data, which is not the case with mobile GPS data. This constitutes a new contribution to the studies on mobility dynamics: a tool that, in times of crisis such as the COVID-19 pandemic, facilitates the recovery of the tourism industry by helping in the strategic planning of new touristic products or the adaptation of current ones in response to new touristic demand. The novelty of this research lies in the implementation of a quadrant circle algorithm (QCA) in combination with Twitter location data to determine the origin and route taken by tourists.
The methodology that addresses this objective is presented in the next section. Section 3 describes and discusses the results obtained, and finally, Section 4 summarizes the main conclusions and the open lines of research that arise from the limitations found in the present work.
2. Material and methods
The criterion for choosing the studied areas and the method used for data extraction are explained below. Then, the method followed to find and filter the tweets used to build the route followed by Twitter users, and thus extract important mobility data, is described in detail.
2.1. Definition of the case of study
The proposed methodology was applied in two different kinds of settlements in two different countries: two small villages in Spain and a large city in Canada. Spain has approximately 47 million inhabitants, who are unevenly distributed across the country; the metropolitan area of its capital (Madrid) is the most densely populated. This area has been one of the Spanish regions with the highest internal freedom of movement, despite the national COVID-19 mobility restrictions. We assumed that this status maximizes the traveler-to-resident ratio relative to the rest of the Spanish territory. For this reason, two villages located in this region were selected for the analysis: Buitrago del Lozoya and Rascafría (see Fig. 1). Both have traditionally been touristic villages, since they have important cultural and natural heritage sites. The mobility of inhabitants and visitors involving these villages was analyzed over a period of seven days, including five working days and two non-working days corresponding to a normal weekend. The specific dates are 13 to 20 December 2020, a non-holiday week at the end of the second wave of COVID-19.
Figure 1.

Location of Place 1 and Place 2 in Madrid (Spain).
In the case of Canada, a country with more than 35 million inhabitants, the vast majority of the population is concentrated in metropolitan areas around big cities, whereas the rest of the country contains immense regions with a very low or negligible population density. The city of Toronto was selected for the analysis (see Fig. 2). It is the most populous city in Canada (2,700,000 inhabitants) and the fourth most populous in North America. It is also one of the financial capitals of the world, with great commercial, cultural and sport-related activity. A big city like this one has a high potential for generating tweets. We collected tweets in Toronto over a series of days between 1 and 14 January 2021. In this period there were no COVID mobility restrictions affecting travel into or out of the city [46].
Figure 2.

Location of Place 3 in Toronto (Canada).
2.2. Twitter data compilation and processing
The tweets collected in the analyzed period were compiled together with their metadata using the Twitter Application Programming Interface (API) [45] with a free Twitter Developer account. The tweets were processed with the Tweepy library [47] for the Python programming language [48]. This library translates data and queries from the interface (model and languages) of a legacy data system to another [49]. In addition, the tracking maps were built by combining Python and the folium library.
Twitter assigns geographical information to tweets by two different means. The first is the location from which the tweet itself was published, represented as latitude and longitude coordinates and called point-type geotagging. This is available only when users explicitly share their location when tweeting. Data obtained this way are highly accurate; nevertheless, they represent only 1-2% of Twitter's tweet database. The other method of geotagging, known as bounding box, consists of a set of coordinates enclosing the area where Twitter considers the tweet could have originated. It is represented as a set of four latitude and longitude coordinates that make up a rectangle of no fixed dimensions, covering the approximate area from which the tweet was published. This method is therefore not as accurate as the previous one.
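The two kinds of geotag described above can be told apart with a few lines of Python. The following is only a minimal sketch: it assumes a tweet already parsed into a dictionary with the field layout of the Twitter API v1.1 tweet object (`coordinates` as a GeoJSON point, `place.bounding_box` as a polygon), and the helper name `extract_geotag` is hypothetical. Using the centroid of the box corners as a representative point is an illustrative choice, not something prescribed by this study.

```python
from typing import Optional, Tuple


def extract_geotag(tweet: dict) -> Optional[Tuple[str, float, float]]:
    """Return (kind, lat, lon) for a v1.1-style tweet dict, or None if untagged."""
    point = tweet.get("coordinates")
    if point and point.get("type") == "Point":
        # GeoJSON order is [longitude, latitude].
        lon, lat = point["coordinates"]
        return ("point", lat, lon)

    place = tweet.get("place") or {}
    box = place.get("bounding_box")
    if box and box.get("coordinates"):
        corners = box["coordinates"][0]  # one ring of [lon, lat] corners
        # Assumption: represent the box by the mean of its corners.
        lon = sum(c[0] for c in corners) / len(corners)
        lat = sum(c[1] for c in corners) / len(corners)
        return ("bounding_box", lat, lon)

    return None  # neither kind of geographical metadata is present
```

Tweets for which this helper returns `None` are the ones the rest of the methodology tries to locate.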
In addition, the API makes it possible to obtain tweets with a method called timeline-based search (TL-search). This method retrieves the most recent tweets in the timeline of a specific user [47]. Because it does not filter by geographical coordinates, it allows more data to be gathered, but a lower proportion of the tweets obtained carry any geographical information.
The whole procedure for data acquisition and processing is summarized in Fig. 3a ([50], [51]). As can be observed, it comprises four steps in which methods to find users and their tweets are executed. The geosearch method (based on geographical details) is implemented in steps 1, 3 and 4, while the TL-search (based on the user timeline) is carried out in step 2. Each step is explained in more detail below:
Figure 3.
Summary of the QCA process.
- Step 1 - Find users through an area-based tweet search. In order to trace users, we must first find them. We do this by searching for tweets in a defined area, based on latitude, longitude and radius, and then extracting the users who posted them on a specific day. We use Twitter's search API to query for tweets within a circular area of 5 km radius, the study area. One of the limitations of Twitter's free search is that it returns tweets based on relevance and not completeness, which reduces the set of tweets available for study. Twitter considers tweets to be relevant based on their popularity (measured by retweets and replies), keywords and many other factors [52].
- Step 2 - Time-based tweet selection. A second selection of tweets is made through a timeline search (TL-search). Only 7 days are considered, since this is the period allowed by the Twitter API for a free Developer account. This step searches through the timelines of the step 1 users to obtain their tweets from the last seven days. Any of these tweets that is explicitly geotagged (that is, carries geographic information or was found in step 1) is added to the user's route.
- Step 3 - Outward search. It performs searches with a larger radius than step 1: 1000 km. This step performs a coarser geolocation of the tweets found in step 2 that belong to the users identified in step 1, considering wide search areas (see Fig. 4). The aim is to establish their mobility actions in the analyzed period. This step classifies the tweets into three different groups:
  – Local tweets: tweets that were found in the geosearch centered on the study area (circle with 5 km radius) but could not be located more accurately. Some of them may have been overlooked in step 1 because of the maximum number of tweets considered.
  – Outward tweets: tweets that were found during the outward search, in any of the 1000 km radius circle areas outside of the study area. These circles are centered at predefined points throughout the study area.
  – Non-located tweets: tweets that could not be located at all by means of geosearches in the area of study or the outer areas.
- Step 4 - Inward search. This step performs a fine geolocation of the outward tweets identified in the previous step. It applies the quadrant-circles algorithm (QCA) (see Fig. 3b). As illustrated in Fig. 5, it consists of an iterative search process: once a tweet is found in a circle of step 3, the QCA repeats the search in successively smaller circles, covering all quadrants of each circle, until the tweet is no longer found in the geosearches. The search is gradually refined until certain stopping criteria are met (in this case, a target radius and a limit of movements defined by k). The coordinates of the smallest circle in which the tweet was found, which are the most accurate estimate obtained, are added together with the radius reached to the route of the user who published the tweet.
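The area-based queries of steps 1 and 3 and the three-way classification of step 3 can be sketched as follows. This is a hedged illustration rather than the implementation used in the study: the function names are hypothetical, the `"lat,lon,radiuskm"` string follows the format documented for the `geocode` parameter of the Twitter v1.1 search endpoint, and giving local hits precedence over outward hits is an assumption. The coordinates in the usage note are placeholders.

```python
from typing import Dict, Iterable, List, Set


def geocode_query(lat: float, lon: float, radius_km: float) -> str:
    """Format a circular search area as 'lat,lon,<radius>km'."""
    return f"{lat},{lon},{radius_km:g}km"


def classify_tweets(
    tweet_ids: Iterable[int],
    local_hits: Set[int],
    outward_hits: Set[int],
) -> Dict[str, List[int]]:
    """Split non-geotagged tweets into local, outward and non-located groups.

    local_hits:   ids returned by the geosearch centered on the study area.
    outward_hits: ids returned by any of the wide outward geosearches.
    """
    groups: Dict[str, List[int]] = {"local": [], "outward": [], "non_located": []}
    for tweet_id in tweet_ids:
        if tweet_id in local_hits:  # assumption: local takes precedence
            groups["local"].append(tweet_id)
        elif tweet_id in outward_hits:
            groups["outward"].append(tweet_id)
        else:
            groups["non_located"].append(tweet_id)
    return groups
```

For example, `geocode_query(40.99, -3.64, 5)` builds the query string for a 5 km study area around a placeholder point, and the same helper with a radius of 1000 builds an outward query.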
Figure 4.
Process followed in step 3.
Figure 5.
Process followed in step 4.
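To make the inward search of step 4 more concrete, the sketch below shows one possible reading of the quadrant-circles idea. Several details are assumptions made for illustration and are not taken from the study: planar kilometre coordinates replace latitude and longitude, each quadrant is covered by a child circle of radius R/√2 centered at (±R/2, ±R/2) (the smallest circle that fully contains a quadrant), and the geosearch is replaced by a simulated `search` callable. The names `qca` and `make_fake_search` are hypothetical.

```python
import math
from typing import Callable, Tuple

Point = Tuple[float, float]  # planar (x_km, y_km); a simplification of lat/lon


def qca(
    search: Callable[[Point, float], bool],
    center: Point,
    radius_km: float,
    target_radius_km: float = 5.0,
    k: int = 100,
) -> Tuple[Point, float, int]:
    """Narrow down the circle in which a tweet is found.

    Returns (center, radius_km, searches) for the smallest circle reached.
    """
    searches = 0
    while radius_km > target_radius_km:
        # Assumption: a circle of radius R / sqrt(2) centered at (+-R/2, +-R/2)
        # is the smallest circle that fully covers one quadrant of the parent.
        child_radius = radius_km / math.sqrt(2)
        half = radius_km / 2
        found = False
        for dx, dy in ((half, half), (-half, half), (-half, -half), (half, -half)):
            if searches >= k:  # limit of movements reached
                return center, radius_km, searches
            candidate = (center[0] + dx, center[1] + dy)
            searches += 1
            if search(candidate, child_radius):
                center, radius_km, found = candidate, child_radius, True
                break
        if not found:  # the tweet is no longer returned by any geosearch
            break
    return center, radius_km, searches


def make_fake_search(tweet_xy: Point) -> Callable[[Point, float], bool]:
    """Stand-in for a real geosearch: True when the tweet lies inside the circle."""
    return lambda c, r: math.dist(c, tweet_xy) <= r
```

With a simulated tweet placed at `(300.0, -120.0)` and a 1000 km starting circle at the origin, `qca(make_fake_search((300.0, -120.0)), (0.0, 0.0), 1000.0)` keeps shrinking the circle until the radius drops below the 5 km target. Against the real API the search can stop much earlier, which is consistent with the larger radii reported in the results.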
3. Results and discussion
In this section the results obtained by applying the presented algorithm in two villages in Madrid, on the one hand, and in a city in Canada, on the other hand, are analyzed. In this way, the effectiveness of the QCA algorithm in processing the tweets found in the outward search and defining the movement of users is tested. The performance of the algorithm is measured according to the average radius reached (accuracy) and the average searches required (computational cost).
3.1. Regions of Madrid's case study
After executing the procedure in Place 1, a total of 199 tweets belonging to 16 users were found. These users were fed into step 2 to perform the timeline search, the results of which are shown in Table 1. Step 2 collected a total of 87 tweets, about half of them geolocalized (49.42%). The tweets from step 2 that were not geographically tagged (50.57%) were fed into step 3 to try to locate them with our proposed algorithm. The results are shown in Table 2: 40.91% of them were located inside the study area, although their location could not be determined more accurately, 15.91% were found during the outward search, and the remaining 43.18% could not be located at all. In this way, we increased the number of located tweets by 25 (57% of the 44 initially non-geotagged): 18 local and 7 outward.
Table 1.
Metrics obtained in the first steps in Place 1.
| | Step 1 (No.) | Step 1 (%) | Step 2 (No.) | Step 2 (%) |
|---|---|---|---|---|
| Tweets found | 199 | 100.00 | 87 | 100.00 |
| Number of users | 16 | | 16 | |
| Tweets per user (mean) | 12.44 | | 5.4 | |
| Geotagged tweets | | | 43 | 49.42 |
| Non-geotagged tweets | | | 44 | 50.57 |
Table 2.
Tweet counting after performing step 3 in relation to the total tweets in step 2, for Place 1.
| | Step 3 (No.) | Step 3 (%) |
|---|---|---|
| Non-geotagged tweets obtained in step 2 | 44 | 100.00 |
| Local tweets | 18 | 40.91 |
| Outward tweets | 7 | 15.91 |
| Non-located tweets | 19 | 43.18 |
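The percentages in Table 2 follow directly from the tweet counts. As a quick arithmetic check, the short sketch below recomputes them; the helper name `shares` is hypothetical and the counts are those reported for Place 1.

```python
from typing import Dict


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Express each count as a percentage of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


# Step 3 counts for Place 1 (44 non-geotagged tweets in total).
place1 = {"local": 18, "outward": 7, "non_located": 19}
place1_shares = shares(place1)

# Share of initially non-geotagged tweets that ended up located in some way.
recovered = round(100 * (place1["local"] + place1["outward"]) / sum(place1.values()), 2)
```

Here `recovered` corresponds to the roughly 57% of initially non-geotagged tweets that were located either locally or in the outward search.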
Similarly, Table 3 and Table 4 show the results for Place 2. A total of 156 tweets from 18 users were collected at step 1. Those users were fed into step 2 to find the tweets in their timelines, which yielded 96 new tweets, of which 31.25% were already geotagged. The remaining non-located tweets were fed into step 3, with the following results: only 1.51% of them were found inside the study area, 36.36% were found in one of the 1000 km radius areas of the outward search, and the rest (62.12%) could not be assigned geographic coordinates (see Table 4). In this case, the method locates 25 more tweets than a basic search (38% of those initially non-geotagged): 1 local and 24 outward.
Table 3.
Metrics obtained in the first steps in Place 2.
| | Step 1 (No.) | Step 1 (%) | Step 2 (No.) | Step 2 (%) |
|---|---|---|---|---|
| Tweets found | 156 | 100.00 | 96 | 100.00 |
| Number of users | 18 | | 18 | |
| Tweets per user (mean) | 8.67 | | 5.3 | |
| Geotagged tweets | | | 30 | 31.25 |
| Non-geotagged tweets | | | 66 | 68.75 |
Table 4.
Tweet counting after performing step 3 in relation to the total tweets in step 2, for Place 2.
| | Step 3 (No.) | Step 3 (%) |
|---|---|---|
| Non-geotagged tweets obtained in step 2 | 66 | 100.00 |
| Local tweets | 1 | 1.51 |
| Outward tweets | 24 | 36.36 |
| Non-located tweets | 41 | 62.12 |
At step 4, the QCA processed all tweets located outward at step 3 (15.91% and 36.36% in Places 1 and 2, respectively) and managed to narrow their geographical coordinates down to considerably smaller areas. The efficacy of step 4 is shown in Table 5: the QCA reduced the starting 1000 km radius in which a tweet was first seen to a substantially smaller average radius, namely 33.5 km and 5.5 km for Places 1 and 2, respectively. These results for Spain and its neighboring regions are promising for an algorithm that takes minutes to execute.
Table 5.
Average metrics obtained in the QCA.
| | Place 1 | Place 2 |
|---|---|---|
| Tweets successfully processed (%) | 100.00 | 100.00 |
| Radius reached (km) | 33.50 | 5.50 |
| Number of searches required (mean) | 19.00 | 7.00 |
Once the QCA algorithm processes all the eligible tweets, the centers of the areas obtained by the algorithm are used to estimate the location of any user and create tracks of their mobility. The refinement process inside of the QCA can be appreciated in Fig. 6. The red line connects all the consecutive searches. The markers are at the center of the circles in which the tweet was found.
Figure 6.

Refinement of the circles during QCA. Map layer: OpenStreetMap through the Python library Geopy.
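Once every eligible tweet has an estimated center, building a track amounts to ordering those centers in time. The following minimal sketch illustrates this under stated assumptions: each located tweet is given as a `(timestamp, lat, lon, radius_km)` tuple with an ISO 8601 timestamp string, the function names are hypothetical, and the great-circle length of the track is computed with the standard haversine formula on a sphere of radius 6371 km.

```python
import math
from typing import List, Tuple

LatLon = Tuple[float, float]


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def build_track(located: List[Tuple[str, float, float, float]]) -> List[LatLon]:
    """Order located tweets by timestamp and keep their (lat, lon) centers."""
    return [(lat, lon) for _, lat, lon, _ in sorted(located)]


def track_length_km(track: List[LatLon]) -> float:
    """Sum of the distances between consecutive points of a track."""
    return sum(haversine_km(p, q) for p, q in zip(track, track[1:]))
```

The resulting list of points is the kind of input a mapping library needs to draw a polyline connecting the consecutive positions of a user.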
As an example, Fig. 7 shows two user tracks created by our analyses. As seen in the legend, all the kinds of tweets we have been dealing with are identified: tweets located by Twitter with point-type geotagging, and tweets located by our algorithm, labeled as local tweets and QCA tweets. For the latter two, the approximate area in which the tweet was found is also shown, based on the results of steps 3 and 4, respectively. In particular, the black circles in Fig. 7 represent the result of applying the QCA to previously non-located tweets and reducing their area of appearance to a relatively narrow space. Because of the technical limitations established by Twitter and the pandemic-related mobility restrictions at the time of this study, most of the detected displacements took place on a weekend (Fig. 7), which is consistent with the touristic attractiveness of Place 1 (Fig. 7a) and Place 2 (Fig. 7b).
Figure 7.
Examples of tracks built.
3.2. Toronto's case study
Results obtained from the analysis performed on the city of Toronto are shown in Table 6. In step 1, 2750 different tweets from a total of 500 users were collected through Twitter's free API. After feeding step 2 with the users from step 1 to search for more tweets in their timelines, 46% of the 2506 tweets found were geotagged. Of the non-located tweets (54% of the total obtained from step 2), 48.71% were determined to be local tweets, 42.94% could not be located and the remaining 8.35% were found in the outward search. The results of the whole process, together with the analysis of the effectiveness of step 3 in locating non-geotagged tweets, are described in Table 6 and Table 7. Table 6 shows that 57.06% of the non-geotagged tweets from step 2 (772 tweets) were found through step 3.
Table 6.
Metrics obtained in the first steps in Toronto.
| Step 2 | No. | % | Step 3 (non-geotagged tweets) | No. | % |
|---|---|---|---|---|---|
| Tweets | 2506 | | Local | 659 | 48.71 |
| Number of users | 500 | | Outward | 113 | 8.35 |
| Tweets per user (mean) | 5.10 | | Not located | 581 | 42.94 |
| Geotagged tweets | 1153 | 46.01 | | | |
| Non-geotagged tweets | 1353 | 53.99 | | | |
Table 7.
Average metrics obtained in the QCA.
| | Place 3 |
|---|---|
| Tweets successfully processed (%) | 100.00 |
| Radius reached (km) | 375.10 |
| Number of searches required (mean) | 9.90 |
Regarding the results obtained from step 4 (the QCA), a much greater radius was reached on average, 375.1 km, compared to Places 1 and 2. This might be attributed to the fact that the outward search was now performed over North America, focusing on the larger and less densely populated area of Canada, in contrast to the relatively dense and small areas studied in Spain (Europe). As in the previous case, the algorithm runs in a matter of minutes.
The lower number of searches required to locate a tweet, roughly 10, may be explained by the greater radius reached on average, since fewer iterations of the QCA were performed from the 1000 km starting point before it stopped. As in the Madrid cases, the QCA processes all of the outward tweets to reduce their area of appearance and assigns each one the center of the smallest circle in which it is located. Examples of this process are shown for two specific Twitter users in Fig. 8a and Fig. 8b. In Fig. 8, the black and blue circles represent the result of applying the QCA to previously non-located tweets and reducing their area of appearance to a relatively narrow space.
Figure 8.
Examples of tracks built for two specific users.
3.3. Discussion
The developed algorithm exploits Twitter's functionalities to find a greater number of tweets with geographical location than methods that rely only on point geotags and bounding boxes. These methods have been used previously by researchers to analyze mobility [33], [31], [41], given that they offer positional coordinates of tweets (very accurate in the former and approximate in the latter), but they are limited in the number of tweets available. Thus, by the end of the procedure, each geolocalized tweet can have one of these kinds of geographical tags: point geotag, bounding box, local tweet and QCA tweet. The first two are available from Twitter; the last two are products of our algorithm.
The last two geographical tags are classified depending on how the tweets were found. On the one hand, local tweets are those found inside the study area that lack a more accurate type of geotagging and are therefore assigned the coordinates of that area. On the other hand, QCA tweets are those whose coordinates have been assigned by the QCA and whose search area has been reduced from the initially wider area obtained in the outward search, thus improving accuracy. This greater number of tweets, localized at steps 3 and 4, allows us to undertake mobility studies with a wider range of tweets (not only those that Twitter can offer) and a higher resolution in the resulting user tracks. This last point allows researchers to conduct better analyses of mobility within a period of time.
The results of our analyses show that it is possible to exploit a larger and more diverse dataset in a study area using data from the social network Twitter, enabling researchers to know with greater accuracy how many users enter that area, where they come from and when their entries or exits happen. On the one hand, the contribution of the presented method lies in the fact that previous similar studies focus only on analyzing differences by age, ethnicity or other groups of people [53], perform static estimation without creating roadmaps [54], [55], characterize tourist locations [56] or rely on data that are more difficult or costly to obtain [57]. The proposed algorithm is capable of extracting not only the static position but also the path followed by Twitter users within minutes. On the other hand, the limitations of the presented methodology lie in the restriction to Twitter users and in the limitations of the Twitter API.
4. Conclusions
In this article we have defined an algorithm to detect mobility patterns with data from Twitter. We have applied the algorithm to the information extracted in three places: two small villages in Madrid and a large city in Canada. We have observed that, based on Twitter location data, the presented algorithm is able to identify the approximate coordinates of specific tweets and offer a track of the users even if their tweets were not geotagged to begin with. Effectively, this algorithm allows us to exploit Twitter's heuristics to assign geographic locations to tweets that are not localized, narrow down the possible areas where they were published and assemble traces with more data than are normally available. Specifically, we found that 15.91%, 36.36% and 8.35% of the tweets not explicitly geotagged in the analyzed places were enriched with geographic information through the presented algorithm. On top of that, we also found that its accuracy depends on the characteristics of the study area: it performed better in more densely populated areas where, as in Place 2, the radius obtained by the QCA can be reduced to 5.5 km.
The application of this algorithm can constitute a strategic tool for the tourism industry, which is currently stalled at a global level due to the COVID-19 pandemic. The behavior patterns of travelers will change according to variables such as the feeling of safety, the desire to visit places with a light tourist flow or the search for experience-based tourism. The detection of human mobility becomes a core task when planning a strategy for promoting new touristic products or for adapting existing ones to the new needs of travelers. Thus, this work contributes to managing the crisis in tourism and is a useful tool for the strategic planning of new and adapted touristic products. The use of data from social networks provides useful knowledge for understanding human mobility.
In addition to the restriction to Twitter users, the main limitation of this study was the amount of data available: only published data from Twitter are considered, and only seven days of data can be accessed with a free developer account (as used in this study). On the one hand, a more extensive data sample would make it possible to analyze more areas and to draw different conclusions depending on the geographical differences and the different types of urban centers considered, which could increase the applicability of the algorithm. On the other hand, data from a longer period of time would allow measuring the effect of certain events (COVID, wars, economic crises, etc.) on the mobility of people, or the difference between seasons. The outcomes of this research show that the developed methodology can help improve the study of the routes followed by tourists and, thus, the strategic planning to satisfy their needs. With a simple technique and easily obtained data, it is demonstrated that the route followed by people who post tweets can be detected through the methodology presented, even when those tweets have not been geotagged.
Moreover, the quality of the QCA as an estimator of tweet geolocation has room for improvement. As future lines of research, the test sample could be expanded to more areas in order to achieve more consistent results, and two ways to make the algorithm more accurate can be studied. The first is based on considering the intersection of the circles in which a tweet is found. Theoretically, the tweet was published in the intersection of those circles, which would make it possible to further decrease the area assigned to the tweet. The second suggested improvement is to replace the quadrant-circles algorithm with a ring algorithm. Once a tweet is found in the outward search with a 1000 km radius circle, the search would be iterated from the same center with a decreasing radius until the tweet stops appearing. Once it does, the algorithm would traverse the perimeter of the resulting circle, searching in adjoining circular areas around it to try to find the tweet again, thus improving the accuracy of the algorithm.
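The first phase of the ring algorithm suggested above can be sketched in a few lines. This is only an illustration of the idea under explicit assumptions, since the algorithm is proposed as future work: the radius is halved at each iteration, the geosearch is abstracted as a `search(center, radius_km)` callable, and the function name `ring_bounds` and its parameters are hypothetical. The second phase (traversing the perimeter with adjoining circles) is not covered.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def ring_bounds(
    search: Callable[[Point, float], bool],
    center: Point,
    radius_km: float = 1000.0,
    factor: float = 0.5,
    min_radius_km: float = 1.0,
) -> Tuple[float, float]:
    """Shrink the search radius around a fixed center until the tweet disappears.

    Returns (inner_km, outer_km): the tweet is still found with outer_km but
    not with inner_km, so it lies in the ring between the two radii.
    """
    outer = radius_km
    while outer * factor >= min_radius_km:
        if not search(center, outer * factor):
            return outer * factor, outer  # tweet vanished: ring identified
        outer *= factor  # tweet still visible: keep shrinking
    return 0.0, outer  # minimum radius reached without losing the tweet
```

For a tweet that is only returned by searches of at least 300 km around the center, the sketch stops with the ring between 250 km and 500 km, which bounds the perimeter that the second phase would then have to explore.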
CRediT authorship contribution statement
Pilar Muñoz-Dueñas; Javier Martínez-Torres: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data.
Miguel Martínez-Comesaña: Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.
Guillermo Bastos-Costas: Conceived and designed the experiments; Performed the experiments; Wrote the paper.
Declaration of Competing Interest
The authors declare no conflicts of interest.
Acknowledgements
This research was partially supported by the Ministry of Science, Innovation and Universities of the Spanish government under the RETOS project (PID2020-116040RB-I00) and by the European Union (EU) under the EAPA project (EAPA_744/2018 Atlantic CultureScape). The authors also want to thank the Ministry of Science, Innovation and Universities, Spain (grant FPU19/01187). Funding for open access charge: Universidade de Vigo/CISUG, Spain.
Data availability
Data will be made available on request.
References
- 1. London S., Rojas M.L., Candias K.N. Turismo sostenible: un modelo de crecimiento con recursos naturales. Ens. Econ. 2021;31(58):158–177. doi: 10.15446/ede.v31n58.88712.
- 2. Cortés-Jiménez I. Which type of tourism matters to the regional economic growth? The cases of Spain and Italy. Int. J. Tour. Res. 2008;10(2):127–139. doi: 10.1002/jtr.646. https://onlinelibrary.wiley.com/doi/abs/10.1002/jtr.646
- 3. World Tourism Organization (UNWTO) International tourism highlights, 2019 edition. 2019. https://doi.org/10.18111/9789284421152
- 4. UNWTO UNWTO global tourism dashboard. Country profile–outbound. 2022. https://www.unwto.org/country-profile-outbound-tourism
- 5. World Travel & Tourism Council (WTTC) Tourism Research – economic impact reports. 2022. https://wttc.org/Research/Economic-Impact
- 6. Bureau A.T. Effects of novel coronavirus (COVID-19) on civil aviation: economic impact analysis. ICAO J. 2020.
- 7. Gössling S., Scott D., Hall C.M. Pandemics, tourism and global change: a rapid assessment of COVID-19. J. Sustain. Tour. 2021;29(1):1–20. doi: 10.1080/09669582.2020.1758708.
- 8. Jamal T., Budke C. Tourism in a world with pandemics: local-global responsibility and action. J. Tour. Futures. 2020.
- 9. García-Palomares J.C., Gutiérrez J., Mínguez C. Identification of tourist hot spots based on social networks: a comparative analysis of European metropolises using photo-sharing services and GIS. Appl. Geogr. 2015;63:408–417. doi: 10.1016/j.apgeog.2015.08.002. https://www.sciencedirect.com/science/article/pii/S0143622815001952
- 10. Cresswell T. Mobilities I: catching up. Prog. Hum. Geogr. 2010;35(4):550–558. doi: 10.1177/0309132510383348.
- 11. Sheller M. The new mobilities paradigm for a live sociology. Curr. Sociol. 2014;62(6):789–811. doi: 10.1177/0011392114533211.
- 12. Sheller M., Urry J. The city and the car. Int. J. Urban Reg. Res. 2000;24(4):737–757. doi: 10.1111/1468-2427.00276.
- 13. Liu Y., Wang F., Xiao Y., Gao S. Urban land uses and traffic ‘source-sink areas’: evidence from GPS-enabled taxi data in Shanghai. Landsc. Urban Plan. 2012;106(1):73–87. doi: 10.1016/j.landurbplan.2012.02.012. https://www.sciencedirect.com/science/article/pii/S0169204612000631
- 14. Peng C., Jin X., Wong K.-C., Shi M., Liò P. Collective human mobility pattern from taxi trips in urban area. PLoS ONE. 2012;7(4) doi: 10.1371/journal.pone.0034487.
- 15. Frias-Martinez V., Frias-Martinez E. Spectral clustering for sensing urban land use using Twitter activity. Eng. Appl. Artif. Intell. 2014;35:237–245. doi: 10.1016/j.engappai.2014.06.019. https://www.sciencedirect.com/science/article/pii/S0952197614001419
- 16. Shahid F., Ony S.H., Albi T.R., Chellappan S., Vashistha A., Islam A.B.M.A.A. Learning from tweets: opportunities and challenges to inform policy making during Dengue epidemic. Proc. ACM Hum.-Comput. Interact. May 2020;4(CSCW1) doi: 10.1145/3392875.
- 17. Signorini A., Segre A.M., Polgreen P.M. The use of Twitter to track levels of disease activity and public concern in the U.S. during the influenza A H1N1 pandemic. PLoS ONE. 2011;6(5) doi: 10.1371/journal.pone.0019467.
- 18. Gibson S. Routledge; 2016. Mobilizing Hospitality: The Ethics of Social Relations in a Mobile World.
- 19. Hannam K., Butler G., Paris C.M. Developments and key issues in tourism mobilities. Ann. Tour. Res. 2014;44:171–185. doi: 10.1016/j.annals.2013.09.010. https://www.sciencedirect.com/science/article/pii/S016073831300131X
- 20. Haldrup M., Larsen J. Routledge; 2009. Tourism, Performance and the Everyday: Consuming the Orient.
- 21. Urry J., Larsen J. SAGE Publications Ltd; 2011. The Tourist Gaze 3.0. https://uk.sagepub.com/en-gb/eur/the-tourist-gaze-30/book234297
- 22. Urry J. SAGE Publications Ltd; 1990. The Tourist Gaze: Leisure and Travel in Contemporary Societies. https://books.google.es/books/about/The_Tourist_Gaze.html?id=-t9-AAAAMAAJ&redir_esc=y
- 23. King R., Christou A. Cultural geographies of counter-diasporic migration: perspectives from the study of second-generation ‘returnees’ to Greece. Popul. Space Place. 2010;16(2):103–119. doi: 10.1002/psp.543.
- 24. Luo F., Cao G., Mulligan K., Li X. Explore spatiotemporal and demographic characteristics of human mobility via Twitter: a case study of Chicago. Appl. Geogr. 2016;70:11–25. doi: 10.1016/j.apgeog.2016.03.001. https://www.sciencedirect.com/science/article/pii/S0143622816300194
- 25. Agency O. Twitter by the numbers: stats, demographics and fun facts. 2021. https://www.omnicoreagency.com/twitter-statistics
- 26. Su S., Wan C., Hu Y., Cai Z. Characterizing geographical preferences of international tourists and the local influential factors in China using geo-tagged photos on social media. Appl. Geogr. 2016;73:26–37. doi: 10.1016/j.apgeog.2016.06.001. https://www.sciencedirect.com/science/article/pii/S014362281630131X
- 27. Papapicco C., Mininni G. Twitter culture: irony comes faster than tourist mobility. J. Tour. Cult. Change. 2020;18(5):545–556. doi: 10.1080/14766825.2019.1611839.
- 28. Li X., Xu H., Huang X., Guo C.A., Kang Y., Ye X. Emerging geo-data sources to reveal human mobility dynamics during Covid-19 pandemic: opportunities and challenges. Comput. Urban Sci. 2021;1(1):22. doi: 10.1007/s43762-021-00022-x.
- 29. Carvalho A.M., Ferreira M.C., Dias T.G. Understanding mobility patterns and user activities from geo-tagged social networks data. 23rd EURO Working Group on Transportation Meeting, EWGT 2020; Paphos, Cyprus, 16–18 September 2020; 2021. pp. 493–500.
- 30. Dredze M., García-Herranz M., Rutherford A., Mann G. Twitter as a source of global mobility patterns for social good. 2016. https://doi.org/10.48550/ARXIV.1606.06343 https://arxiv.org/abs/1606.06343
- 31. Provenzano D., Hawelka B., Baggio R. The mobility network of European tourists: a longitudinal study and a comparison with geo-located Twitter data. Tour. Rev. 2018.
- 32. Hawelka B., Sitko I., Beinat E., Sobolevsky S., Kazakopoulos P., Ratti C. Geo-located Twitter as proxy for global mobility patterns. Cartogr. Geogr. Inf. Sci. 2014;41(3):260–271. doi: 10.1080/15230406.2014.890072.
- 33. Brogueira G., Batista F., Carvalho J.P. Using geolocated tweets for characterization of Twitter in Portugal and the Portuguese administrative regions. Soc. Netw. Anal. Min. 2016;6(1):37. doi: 10.1007/s13278-016-0347-8.
- 34. Jurdak R., Zhao K., Liu J., AbouJaoude M., Cameron M., Newth D. Understanding human mobility from Twitter. PLoS ONE. 2015;10(7):1–16. doi: 10.1371/journal.pone.0131469.
- 35. Mendieta J., Suárez S., Vaca C., Ochoa D., Vergara C. 2016 Third International Conference on eDemocracy eGovernment (ICEDEG) 2016. Geo-localized social media data to improve characterization of international travelers; pp. 126–132.
- 36. Bassolas A., Lenormand M., Tugores A., Gonçalves B., Ramasco J.J. Touristic site attractiveness seen through Twitter. EPJ Data Sci. 2016;5(1) doi: 10.1140/epjds/s13688-016-0073-5.
- 37. Béjar J., Álvarez S., García D., Gómez I., Oliva L., Tejeda A., Vázquez-Salceda J. Discovery of spatio-temporal patterns from location-based social networks. J. Exp. Theor. Artif. Intell. 2016;28(1–2):313–329. doi: 10.1080/0952813X.2015.1024492.
- 38. Xin Y., MacEachren A.M. Characterizing traveling fans: a workflow for event-oriented travel pattern analysis using Twitter data. Int. J. Geogr. Inf. Sci. 2020;34(12):2497–2516. doi: 10.1080/13658816.2020.1770259.
- 39. Martín Y., Cutter S.L., Li Z., Emrich C.T., Mitchell J.T. Using geotagged tweets to track population movements to and from Puerto Rico after Hurricane Maria. Popul. Environ. 2020;42(1):4–27. doi: 10.1007/s11111-020-00338-6.
- 40. Sakurai M., Adu-Gyamfi B. Disaster-resilient communication ecosystem in an inclusive society – a case of foreigners in Japan. Int. J. Disaster Risk Reduct. 2020;51 doi: 10.1016/j.ijdrr.2020.101804. https://www.sciencedirect.com/science/article/pii/S2212420920313066
- 41. Devkota B., Miyazaki H. 2018 IEEE 3rd International Conference on Computing, Communication and Security (ICCCS) 2018. An exploratory study on the generation and distribution of geotagged tweets in Nepal; pp. 70–76.
- 42. Tenkanen H., Di Minin E., Heikinheimo V., Hausmann A., Herbst M., Kajala L., Toivonen T. Instagram, Flickr, or Twitter: assessing the usability of social media data for visitor monitoring in protected areas. Sci. Rep. 2017;7(1) doi: 10.1038/s41598-017-18007-4.
- 43. Hamstead Z.A., Fisher D., Ilieva R.T., Wood S.A., McPhearson T., Kremer P. Geolocated social media as a rapid indicator of park visitation and equitable park access. Comput. Environ. Urban Syst. 2018;72:38–50. doi: 10.1016/j.compenvurbsys.2018.01.007. https://www.sciencedirect.com/science/article/pii/S0198971517303538
- 44. Teles da Mota V., Pickering C. Using social media to assess nature-based tourism: current research and future trends. J. Outdoor Recreat. Tour. 2020;30 doi: 10.1016/j.jort.2020.100295. https://www.sciencedirect.com/science/article/pii/S2213078020300190
- 45. Twitter Developer documentation. Search tweets. 2022. https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/overview/standard
- 46. Toronto City of Toronto update on COVID19. 2020. https://www.toronto.ca/news/city-of-toronto-update-on-covid-19-112/
- 47. Roesslein J. Tweepy documentation. 2022. http://docs.tweepy.org/en/latest/#
- 48. Van Rossum G., Drake F.L., Jr. Centrum voor Wiskunde en Informatica; Amsterdam: 1995. Python Reference Manual. https://www.python.org/
- 49. Thiran P., Risch T., Costilla C., Henrard J., Kabisch T., Petrini J., van den Heuvel W.-J., Hainaut J.-L. Report on the workshop on wrapper techniques for legacy data systems. SIGMOD Rec. 2005;34(3):85–86. doi: 10.1145/1084805.1084824.
- 50. D-maps D-maps. https://d-maps.com/carte.php?num_car=2199&lang=es
- 51. Google Google map developers. https://www.mapdevelopers.com/draw-circle-tool.php
- 52. Twitter Search result FAQs. 2022. https://help.twitter.com/en/using-twitter/top-search-results-faqs
- 53. Comito C., Falcone D., Talia D. Mining human mobility patterns from social geo-tagged data. Pervasive Mob. Comput. 2016;33:91–107. doi: 10.1016/j.pmcj.2016.06.005. https://www.sciencedirect.com/science/article/pii/S1574119216300700
- 54. Dutt F., Das S. Fine-grained geolocation prediction of tweets with human machine collaboration. 2021. https://doi.org/10.48550/ARXIV.2106.13411 https://arxiv.org/abs/2106.13411
- 55. Huang B., Carley K.M. A hierarchical location prediction neural network for Twitter user geolocation. 2019. https://doi.org/10.48550/ARXIV.1910.12941 https://arxiv.org/abs/1910.12941
- 56. Maeda T.N., Yoshida M., Toriumi F., Ohashi H. Extraction of tourist destinations and comparative analysis of preferences between foreign tourists and domestic tourists on the basis of geotagged social media data. ISPRS Int. J. Geo-Inf. 2018;7(3) doi: 10.3390/ijgi7030099. https://www.mdpi.com/2220-9964/7/3/99
- 57. Kovács Z., Vida G., Elekes Á., Kovalcsik T. Combining social media and mobile positioning data in the analysis of tourist flows: a case study from Szeged, Hungary. Sustainability. 2021;13(5) doi: 10.3390/su13052926. https://www.mdpi.com/2071-1050/13/5/2926





