PLOS One. 2020 Apr 9;15(4):e0228175. doi: 10.1371/journal.pone.0228175

Measuring multi-spatiotemporal scale tourist destination popularity based on text granular computing

Chi Yunxian 1,2, Li Renjie 1,2,*, Zhao Shuliang 3, Guo Fenghua 4
Editor: Song Gao 5
PMCID: PMC7145151  PMID: 32271763

Abstract

User-generated content (UGC) is an important data source for tourism GIScience research. However, no effective approach exists for identifying hidden spatiotemporal patterns within multi-scale unstructured UGC. Therefore, we developed an algorithm to measure tourist destination popularity (TDP) based on a multi-spatiotemporal text granular computing model, called TDPMTGC. To accurately granulate the spatial and temporal information of tourism text, tourism text data granules are used to represent landscape objects. These granules are unified objects that possess multiple attributes, such as spatial and temporal dimensions. The multi-spatiotemporal scales are characterized by the multi-hierarchical structure of granular computing, and transformations of granular layers and data granule size are achieved by scale selection in the spatial and temporal dimensions. Consequently, all scales in the spatial and temporal dimensions are related, which makes the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers comparable. This approach achieves a quantitative description and comparison of the popularity value of granules between adjacent scales and across scales. Thus, the TDP at multi-spatiotemporal scales can be deduced and calculated in a systematic framework. We first introduce the conceptual framework of TDPMTGC to construct a quantitative measurement model of TDP at multi-spatiotemporal scales. Then, we present a dataset construction approach to support multi-spatiotemporal scale granular reorganization. Finally, TDPMTGC is derived to describe both the TDP at a single spatial or temporal scale and the patterns and processes of the TDP at multi-spatiotemporal scales. A case study from Jiuzhaigou shows that the TDP derived using TDPMTGC is consistent with the conclusions of existing studies.
More importantly, TDPMTGC provides additional detailed characteristics, such as the contributions of different scenic spots in a tourist route or scenic area, the monthly anomalies and daily contributions of TDP in a specific year, the distinct weakening of tourist route scale in tourist cognition, and the daily variations of TDP during in-season and off-season times. This is the first time that a granular computing model has been introduced to tourism GIScience that provides a feasible scheme for reorganizing large-scale unstructured text and constructing public spatiotemporal UGC tourism datasets. TDPMTGC constitutes a new approach for exploring tourist behaviors and the driving mechanisms of tourism patterns and processes.

1 Introduction

As an important part of geographical information science (GIScience [1]), tourism GIScience mainly studies a series of basic problems involved in the processing, storage, extraction, management and analysis of tourism geographic information with computer technology. Tourist destination popularity (TDP), which refers to tourists’ attention to tourist destinations, is a popular issue in tourism GIScience research that can be expressed through the number of visitors [2-4], an index related to online searches and evaluations, and the user-generated content (UGC) published by tourists [5-7]. TDP is closely related to tourists’ perceptions, preferences, and behaviors, which are critical to local tourism development because they provide important insights beyond the physical attributes of the landscape in a tourist destination and reflect its social significance [8]. By exploring the spatiotemporal characteristics [9-11] and evolutionary patterns of tourism destinations [12-13], researchers can analyze the influences of tourist perception [14], tourist satisfaction [15] and tourist spatiotemporal behaviors.

The focus of geographic spatiotemporal data mining is to study effective technical methods of exploring spatiotemporal data to support mining interesting patterns, anomalies and relationships within the data in the temporal and spatial dimensions [16]. While questionnaires are the traditional data acquisition approach for TDP [17-19], this approach has some deficiencies, such as small sample sizes and the difficulty of guaranteeing the quality of survey results, which cause deviations in data analysis results [20]. Tourism big data provides a new solution to the above problems [21]. TDP can be analyzed using logs containing navigation, check-ins, and mobile positioning information. However, the semantic connotations of the TDP cannot be extracted from such logs due to the lack of content description. In the past decade, the quantity of information and the number of users on the Internet have increased tremendously; consequently, UGC is increasingly spreading through social networks. Multi-scale unstructured text UGC usually contains spatiotemporal semantics; therefore, tourists can access this valuable information to choose tourist destinations or make travel plans. However, due to the explosive growth in the scale of such data, users must spend considerable time evaluating and extracting the collected information [22-23]. Therefore, mining knowledge from multi-scale unstructured UGC has become a popular research topic in various fields [24-27]. Scale is an important concept in geography [28-29]. Although previous popularity analysis methods for tourist destinations made multi-scale divisions on the temporal scale, they regarded a tourist destination as an integral unit on the spatial scale and often ignored its internal spatial characteristics, which affected the precision of the methods. In-depth analysis of the spatiotemporal characteristics between scales helps improve model precision.
However, establishing an accurate relationship between text and spatial units of different scales and integrating multi-spatial and multi-temporal scales into a systematic model are still obstacles in the study of tourism GIScience.

The granular computing (GrC) model can address the concept of scale well [28,30]. The GrC model divides the research object into several layers with different granularities, and the layers are interrelated to form one unified unit. Fine-grained information can support fine-scale descriptions of scenic spots’ popularity, location and spatiotemporal patterns, and the geographical laws of the larger scales can be analyzed by enlarging the granularity. We propose a model named tourist destination popularity measurement based on text granular computing (TDPMTGC). A GrC model is used to reconstruct the UGC data, granulate the tourism text [31], quantitatively describe the TDP, analyze the coupling relationship between different spatial and temporal scales [20], solve large-scale text data processing and complex problems [32-33], and describe the multi-scale geographic patterns and processes [34-35]. Determining the spatiotemporal behavior rules of tourist groups and analyzing the spatial patterns and driving mechanisms of TDP are both topics of considerable interest. TDPMTGC is extensible and can be applied to existing approaches or models to improve their detail. To accurately granulate the spatial and temporal information of tourism text, a tourism text data granule is used to represent a landscape object, which is a unified whole that possesses multiple attributes, such as spatial and temporal dimensions. The multi-spatiotemporal scales are characterized by the multi-hierarchical structure of GrC, and the transformations of granular layers and data granule size are realized by scale selection in the spatial and temporal dimensions. Consequently, all scales in the spatial and temporal dimensions are related, which makes the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers comparable.
This approach achieves a quantitative description and comparison of the popularity value of granules between adjacent scales and cross-scales. Therefore, the TDP with multi-spatiotemporal scales can be calculated in a systematic framework. Thus, we can gain unique information by applying TDPMTGC to texts that cannot be obtained via other, possibly simpler, approaches (e.g., simply counting the number of visitors or the number of social media posts).

The main contributions of this paper to tourism GIScience are as follows. (1) We introduce the GrC model into tourism geography through the TDPMTGC algorithm, which constructs a quantitative model of TDP at multi-spatiotemporal scales based on GrC using the inclusion degree. The proposed TDPMTGC can describe the TDP at a single spatial or temporal scale as well as the patterns and processes of TDP at multi-spatiotemporal scales. (2) A dataset construction approach for the text GrC model is proposed to provide a feasible scheme for reorganizing large-scale unstructured text and constructing public spatiotemporal UGC tourism datasets. (3) The TDPMTGC model was successfully applied in the Jiuzhaigou area, resulting in some new insightful conclusions regarding TDP in this area. TDPMTGC provides a new data mining approach for exploring tourist behaviors and analyzing the driving mechanisms of tourism patterns and processes both spatially and temporally.

The remainder of this paper is organized as follows. Related work is reviewed in Section 2. The theory and approach of TDPMTGC are described in Section 3. In Section 4, we present the TDP computing method based on GrC. We then report on a case study from Jiuzhaigou that demonstrates the feasibility of TDPMTGC. Finally, a discussion and conclusions are presented.

2 Literature review

2.1 Semantic knowledge discovery in GIScience

The introduction of the concept of "Geographic Data Mining and Knowledge Discovery" [36] revealed that an important way to discover geographic laws in the big data era is to mine geographic data. More than 80% of this massive amount of data is composed of text, natural language, social media, etc. Many of these datasets exist in semi-structured or unstructured file formats, use different schemas and lack geo-references or semantically meaningful links and descriptions of the corresponding geo-entities [37]. Therefore, how to mine the semantic features of spatiotemporal datasets has become an important topic in the field of GIScience. Among the possible uses are the application of social media data in tourism, geography and other fields in the humanities and social sciences. Mining the semantic information related to tourism destinations, cities, energy and so on is the main focus of current research, which includes studies of TDP [5,38], tourist emotion [39], intention perception [40], areas of interest [41], travel trajectories [42], energy development [43] and so on. Moreover, due to the multi-scale characteristics of geographic datasets in both the spatial and temporal dimensions, it is necessary to mine and analyze the semantic knowledge at multi-spatiotemporal scales [44]. Building a mathematical model is an effective method for discovering semantic knowledge in GIScience. The application of ontologies, Bayesian networks and other models has laid the foundation for multi-spatiotemporal semantic mining [45-46].

The TDPMTGC method proposed in this paper adopts social media data, which can reflect users’ real intentions, to construct a text GrC model that quantitatively calculates TDP at multi-spatiotemporal scales, allowing it to mine semantic knowledge concerning TDP from UGC text.

2.2 Spatiotemporal data mining

Geographic spatiotemporal big data include both earth observation and human behavior data [47], whose value is reflected in hidden rules and knowledge [48-50]. Among these data types, the objects of earth observations are the earth’s surface elements, and the data that can be mined include satellite remote sensing data, monitoring station data, UAV images, and so on [51-52]. With location at their core, these types of data can be structured easily and represented by positional space [47]. In contrast, the main body of human behavior big data concerns human beings, and the data that can be mined include social media data [5,38], mobile phone signals [42], taxi routes [53], and so on. The structural types in such data are complex and diverse and can be represented by flow spaces (including people flow, information flow, relationship flow, etc.) [16,47,54].

The Sina microblog adopted as the data source in this paper belongs to the human behavior big data type. Mining TDP in both the spatial and temporal dimensions can reveal tourists’ attention to tourist destinations from multi-spatiotemporal scales.

2.3 Popularity analysis of tourist destinations

In the big data era, the volume of social media data that reflects real user preferences and the wisdom of crowds has undergone explosive growth. Integrating such big data with rich spatiotemporal information and semantics is an effective way to find popular routes, scenic spots, etc. [38]. Such knowledge can provide references that tourism managers can use to plan tourism resources and that tourists can use to plan reasonable itineraries and improve their tourism experiences. Analyzing tourists’ online comments and microblog posts can reveal their perceptions and preferences regarding a tourism destination which can then be used to reflect the TDP [22,55]. In addition, tourism destination recommender systems can help users cope with information overload, provide personalized recommendations and services, and help find popular tourist destinations [23,41,56].

The above methods take tourism destinations as a complete spatial unit when analyzing their popularity at multi-temporal scales. The method proposed in this paper divides space and time into multiple scales and then integrates them into a systematic framework to achieve a fine-grained multi-scale popularity analysis.

2.4 Granular computing model

The concept of "scale" in the field of GIScience can be modeled using the hierarchical structure of granular computing. Data granules are formal entities that facilitate a way of organizing knowledge about data and relationships. GrC is concerned with the development and processing of data granules [57]. Representing geographic objects as data granules can effectively identify spatiotemporal scales and periodic patterns and improve the logicality, systematicity and efficiency of decision-making [58-59]. GrC makes it possible to flexibly adjust levels and make deductions between levels [57]. By establishing a systematic hierarchical framework [60], the evaluation results from a previous granular level can be employed as criteria at a subsequent granular level [61], allowing locally constructed models to be used to deduce the global model [62].

To our knowledge, this paper is the first to introduce a GrC model to tourism GIScience and expand its application in GIScience. We use tourism text data granules to represent the landscape objects in tourism GIScience and depict the multi-spatiotemporal scales in tourism GIScience through the multi-hierarchical structure of GrC. Then, the TDP at multi-spatiotemporal scales can be deduced within a systematic framework.

3 Theory and method

Mathematical modeling is used to build theoretical models that reflect real problems; then, solving the model can yield results that are also the solution to the real problem. In the big data era, scholars in the field of GIScience use rich social media resources [63] and mathematical methods to model complex GIScience problems [64-65] and explore the patterns and motivations of human activities at multi-spatiotemporal scales [66-68].

The existing approaches calculate the TDP through two methods. The first method uses mobile tourist locations to determine popularity. This approach has high spatial accuracy but lacks the semantic content of multi-spatial scales. The second method determines popularity through keywords in tourist UGC. This approach has clear semantic content but insufficient spatial accuracy. Given the problems with the above methods, we introduce the TDPMTGC algorithm, which uses the inclusion degree based on conditional probability for GrC. Text data granules are applied to calculate the TDP to achieve quantitative descriptions and in-depth mining of the multi-spatiotemporal TDP. The design of this method is as follows. We first introduce the concept of ‘information granule’ into tourism GIScience to design the text granulation method and the data granular structure of tourism UGC with a spatiotemporal scale. Then, we introduce the concept of inclusion degree based on conditional probability. Finally, we present the TDPMTGC conceptual framework using the inclusion degree. TDPMTGC is used to describe both TDP at a single spatial or temporal scale and the patterns and processes of TDP at multi-spatiotemporal scales.

3.1 Definitions of related concepts

GrC originated from the idea of ‘information granulation’, which was first introduced by Zadeh [30] in 1979. Information from different perspectives and levels is defined as different granules that can be used for massive data processing and complex problem solving [32]. GrC provides an effective theory and method for solving the thematic organization problem of big data [33]. Introducing GrC into tourism GIScience also generates the related concepts of GrC in tourism texts.

Definition 1. Tourism Text Data Granule. A tourism text block defined based on tourism text elements characterized by time, space, similarity, adjacency, uncertainty, or function [69-70], denoted as Gr.

Definition 2. Tourism Text Data Granulation. The granulation process that divides large-scale and complex tourism text into small and semantically clear tourism text data granules based on a set of criteria associated with spatial and temporal scale features or other geographical thematic semantics.

Definition 3. Tourism Text Data Granular Layer. A layer composed of a set of tourism text data granules based on certain granulation criteria, denoted as L.

Definition 4. Multi-scale Tourism Text Data Granular Structure. The geographical relational structure formed by the connections between multiple tourism text data granules corresponding to different granulation criteria, denoted as GrS.

Definition 5. Tourism Text Data Granular Computing. Computing processes that use tourism text data granules to describe, analyze, and solve tourism text mining problems from different scales and perspectives [32].

3.2 Granulation criteria of tourism text

3.2.1 Representation of tourism text data granules

Geographical objects can be described via four dimensions: longitude, latitude, altitude and time. The first three dimensions constitute the space (longitude, latitude, height). Both time and space have scalar properties [28] and are related.

  1. Granule: Tourism text data granules are represented as $Gr_j^{Sr\&Tr}$, where $Sr$ and $Tr$ represent the spatial and temporal scales of the data, respectively. $Sr = \{1, \dots, S_{\max}\}$ and $Tr = \{1, \dots, T_{\max}\}$, where $S_{\max}$ and $T_{\max}$ are the total numbers of spatial and temporal granular layers, respectively, and $j$ is an index reflecting the order of granules. Tourism text data granules are the basic elements of tourism text data GrC models.

  2. Spatial and temporal granular layers: A tourism text data granular layer is an abstract representation of tourism GIScience problems. Different spatial and temporal scales can be described as different spatial and temporal granular layers.

    As shown in Fig 1, $SRuleSet$ is a set of spatial granulation criteria. The criteria for each layer are $SRule_{s\&Tr} \in SRuleSet$ $(s = 1, \dots, S_{\max} - 1)$. Each granule $Gr_j^{s\&Tr}$ in layer $s$ is granulated into $N_{(s+1)\&Tr}$ granules $Gr_j^{(s+1)\&Tr}$ $(s = 1, \dots, S_{\max} - 1;\ j = 1, 2, \dots, N_{(s+1)\&Tr})$ in layer $s + 1$ with $SRule_{s\&Tr}$. All the granules located in layer $s + 1$ constitute the tourism text data granular layer. The granules that satisfy $Gr_j^{s\&Tr} \in L_{s\&Tr}$ $(s = 1, \dots, S_{\max};\ j = 1, 2, \dots, N_{s\&Tr})$ are in the same layer and usually have geospatial semantic properties that conform to the same theme or objective criteria. Similarly, the granulation criteria $TRule_{Sr\&t} \in TRuleSet$ $(t = 1, \dots, T_{\max} - 1)$ apply to temporal scales.

  3. Granular spatial and temporal structures: Each spatial scale corresponds to a spatial granular layer $L_{s\&Tr}$ $(s = 1, \dots, S_{\max})$, and the multi-spatial granular structure is composed of spatial granular layers linked by the spatial granulation criteria $SRule_{s\&Tr}$. Similarly, the multi-temporal granular structure is composed of temporal granular layers $L_{Sr\&t}$ $(t = 1, \dots, T_{\max})$. Taking Jiuzhaigou as an example, for the 4-layer spatial granular structure of ‘tourism destinations→scenic areas→tourist routes→scenic spots’, the granule that represents Rizegou in the 3rd layer can be granulated into many scenic spot granules (belonging to Rizegou) in the 4th layer using the granulation criterion ‘tourist routes→scenic spots’.
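The layered structure described in the list above can be sketched as a small tree in which each granule is split into granules of the next finer layer. This is only an illustrative sketch with the temporal scale held fixed: the `Granule` class, the `granular_layer` helper, and the names `route_B` and `spot_1` to `spot_3` are hypothetical and do not come from the paper; only the four-layer hierarchy and the Rizegou route are taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Granule:
    """A tourism text data granule in spatial layer `layer` (1 = coarsest)."""
    name: str
    layer: int
    children: list["Granule"] = field(default_factory=list)

    def granulate(self, child_names):
        """Apply a granulation criterion: split this granule into layer s+1 granules."""
        self.children = [Granule(n, self.layer + 1) for n in child_names]
        return self.children


def granular_layer(root, s):
    """Collect every granule that sits in spatial layer s."""
    if root.layer == s:
        return [root]
    return [g for child in root.children for g in granular_layer(child, s)]


# Four layers: tourism destination -> scenic area -> tourist route -> scenic spot.
destination = Granule("tourism destination", 1)
(area,) = destination.granulate(["scenic area"])
rizegou, other_route = area.granulate(["Rizegou", "route_B"])  # route_B is a placeholder
rizegou.granulate(["spot_1", "spot_2"])                         # placeholder spot names
other_route.granulate(["spot_3"])

spot_layer = [g.name for g in granular_layer(destination, 4)]
print(spot_layer)
```

A tree is the simplest structure that captures "each granule in layer s is granulated into several granules in layer s + 1"; a temporal structure could be built the same way with year, month, day and hour layers.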

Fig 1. Schematic diagram of the tourism text data granular structure at multi-spatiotemporal scales.

(a) granular structure of tourism text data; (b) popularity mining of single spatial scale data granules at a single temporal scale; (c) popularity mining of single spatial scale data granules at multi-temporal scales; (d) popularity mining of multi-spatial scale data granules at a single temporal scale; (e) popularity mining of multi-spatial scale data granules at a multi-temporal scale.

3.2.2 Granular structure of tourism text data

  1. The structure of a tourism text data granule. A tourism text data granule is a complete entity with multiple attributes, such as space and time, which must be described from spatial, temporal and other dimensions. Among them, both the spatial and temporal dimensions contain multiple scales; thus, the multi-scale structure of granules corresponds to these multi-spatiotemporal scales (see Fig 1(a)). Using this approach, the time and space dimensions are integrated into a single systematic model reflected as attributes of data granules. To describe the spatiotemporal characteristics of the data granules, it is necessary to clearly indicate their spatiotemporal scale, which can be divided into the following situations: ① To describe the characteristics of data granules at a particular spatiotemporal scale, it is necessary to fix the spatial and temporal scales of the granules (see Fig 1(b)); ② To describe the characteristics of data granules at a specific spatial (or temporal) scale, it is necessary to fix the spatial (or temporal) scale of the granules and mine the evolution rules of granules at that multi-temporal (or multi-spatial) scale (see Fig 1(c) and 1(d)); and ③ To describe the characteristics of data granules at multi-spatiotemporal scales, multiple scales of the spatial and temporal dimensions of the granules should be selected to perform comprehensive mining (see Fig 1(e)).

  2. The implementation method of multi-spatiotemporal scale granular structure. The multi-spatiotemporal scale granular structure of tourism text data is represented by the complete graph shown in Fig 1(a), in which layers $1$ to $S_{\max}$ of the multi-spatial granular structure correspond to the $S_{\max}$ scales. The data granules in the upper scale are transformed into those in the lower scale using the granulation criteria $SRule_{s\&Tr}$ $(s = 1, \dots, S_{\max} - 1)$. The data granules shrink as the scale decreases. Similarly, layers $1$ to $T_{\max}$ of the multi-temporal granular structure correspond to the $T_{\max}$ scales, and granules in the upper scale are transformed into those in the lower scale using the granulation criteria $TRule_{Sr\&t}$ $(t = 1, \dots, T_{\max} - 1)$. A complete graph represents the existence of an edge (i.e., a correlation) between any spatial-spatial, temporal-temporal, or spatial-temporal scales. There are $S_{\max}(S_{\max} - 1)/2$ edges among the spatial-spatial scales, $T_{\max}(T_{\max} - 1)/2$ edges among the temporal-temporal scales, and $S_{\max} \cdot T_{\max}$ edges among the spatial-temporal scales; thus, the total number of edges is $(S_{\max} + T_{\max})(S_{\max} + T_{\max} - 1)/2$. The correlation between temporal scales is presupposed by the "spatial-temporal" correlation (i.e., the correlation between two temporal scales ‘$T_k$-$T_l$’ for a spatial scale $S_i$ is obtained by granulating $S_i$ in layers $T_k$ and $T_l$, which yields the correlations ‘$S_i$-$T_k$’ and ‘$S_i$-$T_l$’). The granular structure of tourism text data can be used not only to mine features of small-scale landscapes (where $S_1$ represents a tourist destination) over a short period (such as when $T_1$ represents an annual scale) but also to mine the life cycle evolutionary laws at large scales (where $S_1$ represents a national or even a global scale) over long periods (such as when $T_1$ represents several centuries, if the data are available).
According to the actual needs, subgraphs can be extracted from Fig 1(a) to achieve landscape law mining at a single-space/single-time scale (see Fig 1(b)), single-space/multiple-time scales (see Fig 1(c)), multiple-space/single-time scales (see Fig 1(d)), and multiple-space/multiple-time scales (see Fig 1(e)).
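The edge counts stated for the complete graph of scales can be checked by brute-force enumeration. The sketch below is illustrative only: the function name `count_scale_edges` is hypothetical, and the choice of four spatial and four temporal scales simply mirrors the example scales used later in the paper.

```python
from itertools import combinations


def count_scale_edges(s_max, t_max):
    """Enumerate all edges of the complete graph over s_max spatial and t_max temporal scales."""
    nodes = [("S", s) for s in range(1, s_max + 1)] + [("T", t) for t in range(1, t_max + 1)]
    counts = {"SS": 0, "TT": 0, "ST": 0}
    for (kind_a, _), (kind_b, _) in combinations(nodes, 2):
        # Sorting the two kind letters maps both "ST" and "TS" to the key "ST".
        counts["".join(sorted(kind_a + kind_b))] += 1
    return counts


counts = count_scale_edges(4, 4)
total = sum(counts.values())
print(counts, total)
```

With four scales in each dimension, the enumeration gives 6 spatial-spatial, 6 temporal-temporal and 16 spatial-temporal edges, 28 in total, which agrees with $(4 + 4)(4 + 4 - 1)/2 = 28$.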

In conclusion, a tourism text data granule is a unified whole possessing multiple attributes, such as a spatial and a temporal dimension. The transformations of granular layers and data granule size are achieved by scale selection in both the spatial and temporal dimensions. Consequently, all the scales in the spatial and temporal dimensions are related, which makes the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers comparable. This approach allows for comparisons of the popularity value of data granules both among adjacent scales and across scales, yielding information that cannot be obtained via other, possibly simpler, approaches (e.g., simply counting the number of visitors or the number of social media posts). We can analyze the geographic spatiotemporal relations among the multiple granular layers using the granular structure GrS. Thus, GrS is a useful tool for finely describing the multi-spatiotemporal patterns of TDP.

3.3 Multi-spatiotemporal tourism text granular computing model based on inclusion degree

Approaches related to GrC include fuzzy sets [71], neighborhood topology [72], inclusion degree [73], formal concept analysis [74], algebraic lattices [75], calculus [76], and logical views [77]. Inclusion degree theory is one of the classic methods for implementing GrC [31-32,78]. Inclusion degree theory contains all the results of uncertain reasoning [32]. Conditional probability is a form of uncertain reasoning, representing one kind of inclusion degree. Uncertainty exists in the multi-spatiotemporal features of tourism text data. For example, the text ‘Wucaichi’ is unclear because it does not specify the scenic area or tourist route to which the scenic spot belongs. The inclusion degree is generated based on conditional probability to quantitatively infer TDP at multi-spatiotemporal scales with different granularities, analyze the couplings of TDP between different spatial and temporal scales, and mine the evolutionary patterns of TDP.

3.3.1 Definition of inclusion degree

Definition 6. Inclusion Degree. Let $DSet$ be a tourism text dataset and assume that $DSet$ has three subsets $D_A, D_B, D_C \subseteq DSet$. If $ID(D_B/D_A)$ exists and satisfies the following three properties

  1. Nonnegative: $0 \le ID(D_B/D_A) \le 1$;

  2. Normative: $ID(D_B/D_A) = 1$ when $D_A \subseteq D_B$;

  3. Transitive: $ID(D_A/D_C) \le ID(D_A/D_B)$ when $D_A \subseteq D_B \subseteq D_C$,

then $ID(D_B/D_A)$ is the inclusion degree to which $D_B$ contains $D_A$ (or $D_A$ is contained in $D_B$).

3.3.2 Generating the inclusion degree using conditional probability

The following formulas are appropriate for both spatial and temporal dimensions.

Theorem 1. Let $X$ be a finite set with $n$ elements, let $Gr_A, Gr_B \subseteq X$, and let $N(Gr_A)$ be the number of elements in $Gr_A$. Then $N(Gr_A)/n$ is the corresponding probability measure. If $P$ is the probability distribution of $X$, then

$$ID(Gr_B/Gr_A) = \frac{N(Gr_A \cap Gr_B)}{N(Gr_A)} = \frac{P(Gr_A \cap Gr_B)}{P(Gr_A)} = P(Gr_B \mid Gr_A) \tag{1}$$

is an inclusion degree on $X$ when $P(Gr_A) > 0$.
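As a minimal sketch of Theorem 1, the inclusion degree can be computed by counting set elements, here with granules modelled as sets of post identifiers. The function name `inclusion_degree` and the toy sets `d_a`, `d_b` and `d_c` are assumptions made for illustration; they are not data from the paper.

```python
def inclusion_degree(gr_b, gr_a):
    """ID(GrB/GrA) = N(GrA ∩ GrB) / N(GrA), i.e. P(GrB | GrA) under the counting measure."""
    if not gr_a:
        raise ValueError("P(GrA) must be positive, so GrA cannot be empty")
    return len(gr_a & gr_b) / len(gr_a)


# Toy nested subsets of a dataset of post ids: d_a ⊆ d_b ⊆ d_c.
d_a = {1, 2}
d_b = {1, 2, 3, 4}
d_c = {1, 2, 3, 4, 5, 6, 7, 8}

normative = inclusion_degree(d_b, d_a)   # d_a ⊆ d_b, so d_b fully contains d_a
id_a_over_c = inclusion_degree(d_a, d_c)  # ID(DA/DC)
id_a_over_b = inclusion_degree(d_a, d_b)  # ID(DA/DB)
print(normative, id_a_over_c, id_a_over_b)
```

On these toy sets the three properties of Definition 6 can be read off directly: every value lies in [0, 1], the normative case equals 1, and ID(DA/DC) does not exceed ID(DA/DB).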

3.3.3 The inclusion degree of multi-spatiotemporal tourism text data granularity

Property 1. If the granules of each spatial granular layer $Gr_j^{s\&Tr} \in L_{s\&Tr}$ $(s = 1, \dots, S_{\max};\ j = 1, 2, \dots, N_{s\&Tr})$ meet any of the following conditions

  1. A data granule in one spatial granular layer is a subset of that in the upper granular layer, i.e., $Gr_j^{s\&Tr} \subseteq Gr_j^{(s-1)\&Tr}$.

  2. The collection of several data granules in one spatial granular layer is a subset of that in the upper granular layer, i.e., $\bigcup Gr_j^{s\&Tr} \subseteq Gr_j^{(s-1)\&Tr}$.

then the inclusion degree of the tourism text data granules of two adjacent spatial scales can be defined as $ID(Gr_j^{(s-1)\&Tr}/Gr_j^{s\&Tr})$ (or $ID((\bigcup Gr_j^{s\&Tr})/Gr_j^{s\&Tr})$), which is called the inclusion degree to which $Gr_j^{(s-1)\&Tr}$ contains $Gr_j^{s\&Tr}$ (or $\bigcup Gr_j^{s\&Tr}$ contains $Gr_j^{s\&Tr}$).

Property 2. If the granules of each temporal granular layer $Gr_j^{Sr\&t} \in L_{Sr\&t}$ $(t = 1, \dots, T_{\max};\ j = 1, 2, \dots, N_{Sr\&t})$ meet any of the following conditions

  1. A data granule in one temporal granular layer is a subset of that in a higher layer, i.e., $Gr_j^{Sr\&t} \subseteq Gr^{Sr\&(t-1)}$.

  2. A collection of several data granules in one temporal granular layer is a subset of that in a higher layer, i.e., $\bigcup Gr_j^{Sr\&t} \subseteq Gr_j^{Sr\&(t-1)}$.

then the inclusion degree of tourism text data granules at two adjacent temporal scales can be defined as $ID(Gr^{Sr\&(t-1)}/Gr^{Sr\&t})$ (or $ID((\bigcup Gr_j^{Sr\&t})/Gr_j^{Sr\&t})$). It is called the inclusion degree to which $Gr^{Sr\&(t-1)}$ contains $Gr_j^{Sr\&t}$ (or $\bigcup Gr_j^{Sr\&t}$ contains $Gr_j^{Sr\&t}$).

3.3.4 Multi-spatiotemporal tourism text granular computing model based on inclusion degree

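Combining Sect. 3.3.2 and Sect. 3.3.3, the multi-spatial tourism text GrC model based on inclusion degree can be written as

$$ID(Gr_j^{s\&Tr}/Gr_j^{(s-1)\&Tr}) = P(Gr_j^{s\&Tr} \mid Gr_j^{(s-1)\&Tr}) = \frac{P(Gr_j^{s\&Tr} \cap Gr_j^{(s-1)\&Tr})}{P(Gr_j^{(s-1)\&Tr})}, \tag{2}$$

or

$$ID(Gr_j^{s\&Tr}/(\bigcup Gr_j^{s\&Tr})) = P(Gr_j^{s\&Tr} \mid (\bigcup Gr_j^{s\&Tr})) = \frac{P(Gr_j^{s\&Tr} \cap (\bigcup Gr_j^{s\&Tr}))}{P(\bigcup Gr_j^{s\&Tr})}. \tag{3}$$

Similarly, the multi-temporal tourism text GrC model based on inclusion degree is as follows

$$ID(Gr^{Sr\&t}/Gr^{Sr\&(t-1)}) = P(Gr^{Sr\&t} \mid Gr^{Sr\&(t-1)}) = \frac{P(Gr^{Sr\&t} \cap Gr^{Sr\&(t-1)})}{P(Gr^{Sr\&(t-1)})}, \tag{4}$$

or

$$ID(Gr_j^{Sr\&t}/(\bigcup Gr_j^{Sr\&t})) = P(Gr_j^{Sr\&t} \mid (\bigcup Gr_j^{Sr\&t})) = \frac{P(Gr_j^{Sr\&t} \cap (\bigcup Gr_j^{Sr\&t}))}{P(\bigcup Gr_j^{Sr\&t})}. \tag{5}$$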
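To make the two forms of the spatial model concrete, the following sketch evaluates them on toy granules under the counting measure of Theorem 1. It assumes that a granule is a set of post identifiers and that one tourist route granule contains two scenic spot granules; the sets `route`, `spot_1` and `spot_2` and the helper name `inclusion_degree` are hypothetical.

```python
def inclusion_degree(gr_b, gr_a):
    """ID(GrB/GrA) = N(GrA ∩ GrB) / N(GrA) = P(GrB | GrA)."""
    if not gr_a:
        raise ValueError("the conditioning granule must not be empty")
    return len(gr_a & gr_b) / len(gr_a)


# Toy granules: a route granule (upper layer) and two spot granules (lower layer).
route = set(range(1, 11))          # 10 posts attributed to the route
spot_1 = {1, 2, 3, 4}
spot_2 = {4, 5, 6}                 # post 4 mentions both spots
spots_union = spot_1 | spot_2

# Form of Eq (2): lower-layer granule conditioned on its upper-layer granule.
share_in_route = {
    "spot_1": inclusion_degree(spot_1, route),
    "spot_2": inclusion_degree(spot_2, route),
}

# Form of Eq (3): lower-layer granule conditioned on the union of lower-layer granules.
share_in_union = {
    "spot_1": inclusion_degree(spot_1, spots_union),
    "spot_2": inclusion_degree(spot_2, spots_union),
}
print(share_in_route, share_in_union)
```

The temporal forms in Eqs (4) and (5) would follow the same pattern, for example with a monthly granule conditioned on its yearly granule.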

4 The tourist destination popularity computing approach based on the granular computing model

To implement TDPMTGC to support the quantitative calculation of TDP at multi-spatiotemporal scales, we first need to organize and construct a standard dataset that meets the requirements of the granular computing model. Then, we design the TDP computing approach based on the GrC model.

4.1 The granular computing model dataset construction approach

The spatial information (such as toponyms) in multi-scale unstructured UGC data is implicit in the text and needs to be identified layer by layer. Moreover, the data granules in a lower layer are subsets of those in the next higher layer and of a number of cross-scale layers. For example, tourist route granules at the tourist route scale include not only single-spot granules but also single-route-with-multiple-spots granules and multiple-routes-with-multiple-spots granules from the scenic spot scale, as well as single-route and multiple-route granules at the tourist route scale. After the granules in the lower layer have been constructed, they can be directly integrated into the granules in the upper layer, thus expanding to larger granules layer by layer. Because of this inclusion relationship between scales in the spatial dimension, the dataset is constructed from bottom to top, moving from small to large scales and from fine to coarse granules. The temporal information in UGC data is explicit in each text; thus, data granules in lower layers inherit the labels of those in the upper layers (for example, a granule at a monthly scale must belong to a certain granule at a yearly scale). We adopt a tree structure to complete the construction of the data granules in the upper layer and then decompose them downward layer by layer. This approach clearly indicates the inheritance relationship among the data granules of each layer. Hence, in the temporal dimension, based on the spatial dataset, the dataset is constructed from top to bottom, moving from large to small scales and from coarse to fine granules.
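The top-down temporal decomposition can be sketched by giving each granule a label that extends the label of its parent, so that a monthly granule is identified by (year, month) and automatically belongs to its yearly granule. This is an illustrative sketch only: the function `temporal_granules`, the tuple-label convention and the sample timestamps are assumptions, not the authors' implementation or data.

```python
from collections import defaultdict
from datetime import datetime

SCALES = ("year", "month", "day", "hour")


def temporal_granules(posts):
    """Return {scale: {label: set of post ids}}; each label extends its parent's label."""
    layers = {scale: defaultdict(set) for scale in SCALES}
    for post_id, stamp in posts:
        label = ()
        for scale in SCALES:
            label = label + (getattr(stamp, scale),)  # child inherits the parent label
            layers[scale][label].add(post_id)
    return layers


# Hypothetical posts: (post id, publication time).
posts = [
    (1, datetime(2017, 7, 15, 21)),
    (2, datetime(2017, 7, 15, 9)),
    (3, datetime(2017, 10, 1, 9)),
    (4, datetime(2018, 10, 1, 9)),
]
layers = temporal_granules(posts)
print(dict(layers["month"]))
```

Because the labels are nested, the parent of any granule is found by dropping the last element of its label, which makes the inheritance relationship between layers explicit.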

Consequently, we use the granular structure of tourism text data described in section 3.2.2 to construct datasets that reflect the spatial and temporal dimensions, respectively. Common spatial scales in tourism GIScience include scenic spots, tourist routes, scenic areas, tourist destinations, provinces and nations. Similarly, common temporal scales include year, month, week, day, hour, minute, and second. The number of spatial and temporal scales should be selected according to the size of the tourist destination (e.g., smaller scenic areas can skip the tourist route scale). In this paper, we use four scales in the spatial dimension ("scenic spot", "tourist route", "scenic area" and "tourist destination") and four scales in the temporal dimension ("year", "month", "day" and "time") as examples to introduce the dataset construction method for the spatial and temporal dimensions.

4.1.1 Dataset construction in the spatial dimension

The UGC texts related to tourist destinations are composed of two parts: texts that mention toponym features of a scenic area at different spatial scales (i.e., toponym text) and texts that do not contain any scale-related toponym features (i.e., nontoponym text). Text selection is conducted to discover toponym features at different spatial scales, such as scenic spots, tourist routes, scenic areas, and tourist destinations. Starting from the smallest granule, namely, a scenic spot, text datasets of scenic spots, tourist routes, scenic areas, and tourist destinations are successively established (see Fig 2).

Fig 2. Construction of multi-spatiotemporal scale datasets of tourism text based on granular computing.

(a) dataset construction in the spatial dimension; (b) data granules at different scales; (c) dataset construction in the temporal dimension. The dataset in Fig 2(a) is constructed from bottom to top in four scales, including scenic spot, tourist route, scenic area and tourist destination. Scenic spot granules at the scenic spot scale are colored yellow; and each scenic spot granule represents a scenic spot composed of three parts: a single spot, a route with multiple spots and multiple routes with multiple spots. Route granules at the tourist route scale are colored blue, and each route granule represents a route composed of five parts: single spot, one route with multiple spots, multiple routes with multiple spots, single route and multiple routes. The single spots are represented by the complete set of scenic spots in the route (i.e., union). Similarly, the same approach is used for representing one route with multiple spots and multiple routes with multiple spots. Scenic area granules at the scenic area scale are colored light blue, and each scenic area granule represents a scenic area composed of four parts: scenic spot, tourist route, scenic area and nontoponym text. Among these, scenic spots are represented by the union set of single spots, one route with multiple spots and multiple routes with multiple spots at the tourist route scale, and a tourist route is composed of the union set of single routes and multiple routes at the tourist route scale. The tourist destination granules within the scale of tourist destination are colored aqua, and each tourist destination granule represents a tourist destination, which is represented by the union of several scenic areas within the tourism destination. The dataset in Fig 2(c) is constructed using four scales—year, month, day and hour—in top-down order, and the corresponding colors are light pink, dark pink, red and yellow. The granules at each spatial scale in Fig 2(b) are divided into four temporal scales.

  1. Scenic spot scale $L^{4\&T_r}$: the text collection of each scenic spot, namely, a scenic spot granule $Gr_{line\_j}^{4\&T_r}$ (where $line \in \{A,B,\cdots,X\}$ is the name of the tourist route to which scenic spot $j$ belongs), is filtered by the name of the scenic spot. The filtering results are divided into three categories: 1) a single spot, expressed as $1S^{4\&T_r}$ (for simplicity, A4 is used instead of $1S^{4\&T_r}$ in the following passage), each of which describes a single scenic spot; 2) a single route with multiple spots, expressed as $1R\_mS^{4\&T_r}$ (B4 for short), each of which describes several scenic spots along a single tourist route; 3) multiple routes with multiple spots, expressed as $mR\_mS^{4\&T_r}$ (C4 for short), each of which describes multiple scenic spots along multiple routes simultaneously. Among these, the A4 granules of a tourist route can be summed to describe the complete set of individual scenic spot texts for that route, satisfying $Gr_{line^{e}}^{3\&T_r}=\bigcup_{j\in line}Gr_{line^{e}\_j}^{4\&T_r}$. B4 granules are not additive (because a text that contains multiple scenic spots is counted multiple times). $Gr_{line^{*}\_j}^{4\&T_r}$ refers to the number of times that scenic spot $j$ appears in $Gr_{line^{*}}^{3\&T_r}$, satisfying $Gr_{line^{*}\_j}^{4\&T_r}\subseteq Gr_{line^{*}}^{3\&T_r}$ $(j\in line)$. C4 granules are also not additive. $Gr_{\&\_line\_j}^{4\&T_r}$ refers to the number of times that scenic spot $j$ appears in $Gr_{\&\_line}^{3\&T_r}$, which satisfies $Gr_{\&\_line\_j}^{4\&T_r}\subseteq Gr_{\&\_line}^{3\&T_r}$ $(j\in line)$. Each scenic spot meets the requirement $Gr_{line\_j}^{4\&T_r}=Gr_{line^{e}\_j}^{4\&T_r}\cup Gr_{line^{*}\_j}^{4\&T_r}\cup Gr_{\&\_line\_j}^{4\&T_r}$. For example, a text that mentions three scenic spots on three different routes, namely, Xiniuhai (belonging to Shuzhenggou), Nuorilangpubu (belonging to Rizegou), and Wucaichi (belonging to Zechawagou), is a C4 text and is counted three times during the calculation of scenic spot popularity. Therefore, the sum of the scenic spot counts in C4 is greater than the number of texts in which they originally appear.

  2. Tourist route scale $L^{3\&T_r}$: a tourist route granule $Gr_{line}^{3\&T_r}$ is made up of scenic spots $Gr_{all^{e}}^{2\&T_r}$ and tourist routes $Gr_{all^{\#}}^{2\&T_r}$. Among these, $Gr_{all^{e}}^{2\&T_r}$ corresponds to $1S^{3\&T_r}$, $1R\_mS^{3\&T_r}$, and $mR\_mS^{3\&T_r}$ (A3, B3 and C3 for short) at the scenic spot scale, satisfying $Gr_{all^{e}}^{2\&T_r}=Gr_{line^{e}}^{3\&T_r}\cup Gr_{line^{*}}^{3\&T_r}\cup Gr_{\&\_line}^{3\&T_r}$. The tourist routes $Gr_{all^{\#}}^{2\&T_r}$ include both texts describing a single tourist route (referred to as single route and expressed as $1R^{3\&T_r}$, D3 for short) and texts describing multiple routes at the same time (referred to as multiple routes and expressed as $mR^{3\&T_r}$, E3 for short), satisfying $Gr_{all^{\#}}^{2\&T_r}=Gr_{line^{1}}^{3\&T_r}\cup Gr_{\#\_line}^{3\&T_r}$. Each tourist route meets the requirement $Gr_{line}^{3\&T_r}=Gr_{line^{e}}^{3\&T_r}\cup Gr_{line^{*}}^{3\&T_r}\cup Gr_{\&\_line}^{3\&T_r}\cup Gr_{line^{1}}^{3\&T_r}\cup Gr_{\#\_line}^{3\&T_r}$.

  3. Scenic area scale $L^{2\&T_r}$: a toponym text granule $Gr_{scenic}^{2\&T_r}$ of a scenic area includes both the scenic spots $Gr_{all^{e}}^{2\&T_r}$ and tourist routes $Gr_{all^{\#}}^{2\&T_r}$ at the tourist route scale and the descriptions of the scenic area itself, $Gr_{scenic^{1}}^{2\&T_r}$ (referred to as scenic area), satisfying $Gr_{scenic}^{2\&T_r}=Gr_{all^{e}}^{2\&T_r}\cup Gr_{all^{\#}}^{2\&T_r}\cup Gr_{scenic^{1}}^{2\&T_r}$.

  4. Tourist destination scale $L^{1\&T_r}$: a tourist destination granule $Gr_{district}^{1\&T_r}$ contains the granules of each scenic area within the tourist destination, $Gr_{j\_SCENIC}^{2\&T_r}$, satisfying $Gr_{district}^{1\&T_r}=\bigcup_{j=1}^{N}Gr_{j\_SCENIC}^{2\&T_r}$. Each scenic area granule contains both toponym text granules $Gr_{scenic}^{2\&T_r}$ and nontoponym text granules $\neg Gr_{scenic}^{2\&T_r}$, satisfying $Gr_{SCENIC}^{2\&T_r}=Gr_{scenic}^{2\&T_r}\cup\neg Gr_{scenic}^{2\&T_r}$.
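The category rules in the list above can be sketched in code. The following is a minimal illustration and not the authors' implementation: the mini-gazetteer `SPOT_TO_ROUTE`, the helper names `spots_in` and `classify_text`, the naive substring matching, and the rule that spot names take priority over route names are all assumptions made for this example.

```python
# Hypothetical mini-gazetteer: four spots from this study mapped to their routes.
SPOT_TO_ROUTE = {
    "Xiniuhai": "Shuzhenggou",
    "Nuorilangpubu": "Rizegou",
    "Wuhuahai": "Rizegou",
    "Wucaichi": "Zechawagou",
}
ROUTES = {"Shuzhenggou", "Rizegou", "Zechawagou", "Zharugou"}
AREA = "Jiuzhaigou"


def spots_in(text):
    """Scenic spots named in the text (naive substring match, an assumption)."""
    return sorted(s for s in SPOT_TO_ROUTE if s in text)


def classify_text(text):
    """Assign one text to a category of the spatial granular structure.

    A: single spot, B: one route with multiple spots,
    C: multiple routes with multiple spots, D: single route,
    E: multiple routes, AREA: scenic area only, NONTOPONYM: no toponym.
    Spot names take priority over route names; this ordering is an assumption.
    """
    spots = spots_in(text)
    if len(spots) == 1:
        return "A"
    if len(spots) > 1:
        routes = {SPOT_TO_ROUTE[s] for s in spots}
        return "B" if len(routes) == 1 else "C"
    routes = [r for r in ROUTES if r in text]
    if len(routes) == 1:
        return "D"
    if len(routes) > 1:
        return "E"
    return "AREA" if AREA in text else "NONTOPONYM"


text = "Xiniuhai in the morning, then Nuorilangpubu and Wucaichi"
print(classify_text(text), spots_in(text))
```

The example text falls into category C, and `spots_in` returns three spots, which mirrors why such a text is counted three times when scenic spot popularity is calculated.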

The schematic diagram in Fig 2 shows how the granules located in each spatial granular layer are consistent with the colors of the corresponding granules in Fig 1.

4.1.2 Dataset construction in the temporal dimension

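The granules at each spatial scale are granulated from top to bottom according to the temporal scale, and four temporal scale granules (year, month, day and hour) are obtained: $Gr_{j}^{S_r\&1\_a}$, $Gr_{j}^{S_r\&2\_a/b}$, $Gr_{j}^{S_r\&3\_a/b/c}$ and $Gr_{j}^{S_r\&4\_a/b/c/d}$, satisfying

$$Gr_{j}^{S_r\&1}=\bigcup_{a\in YearSet}Gr_{j}^{S_r\&1\_a}=\bigcup_{a\in YearSet}\bigcup_{b\in MonthSet}Gr_{j}^{S_r\&2\_a/b}=\bigcup_{a\in YearSet}\bigcup_{b\in MonthSet}\bigcup_{c\in DaySet}Gr_{j}^{S_r\&3\_a/b/c}=\bigcup_{a\in YearSet}\bigcup_{b\in MonthSet}\bigcup_{c\in DaySet}\bigcup_{d\in HourSet}Gr_{j}^{S_r\&4\_a/b/c/d}, \tag{6}$$

where $S_r$ represents the scales of tourist destinations, scenic areas, tourist routes and scenic spots.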
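The top-down temporal granulation can be pictured as a nested tree keyed by year, month, day and hour. The sketch below is only an illustration under stated assumptions: `build_time_tree` and `count` are hypothetical helper names, and the three timestamps are invented sample inputs rather than records from the dataset.

```python
from collections import defaultdict
from datetime import datetime


def build_time_tree(timestamps):
    """Nest text counts as year -> month -> day -> hour."""
    tree = defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    )
    for ts in timestamps:
        tree[ts.year][ts.month][ts.day][ts.hour] += 1
    return tree


def count(node):
    """Number of texts under a node, i.e. the size of the union of its children."""
    if isinstance(node, int):
        return node
    return sum(count(child) for child in node.values())


# Invented sample timestamps, for illustration only.
posts = [
    datetime(2017, 8, 8, 21),
    datetime(2017, 8, 9, 10),
    datetime(2013, 10, 2, 15),
]
tree = build_time_tree(posts)
print(count(tree), count(tree[2017]), count(tree[2017][8][8]))
```

Because every hour granule sits under exactly one day, month and year, the count at any node equals the sum of the counts of its children, which is the inheritance relationship that the tree structure is meant to make explicit.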

4.2 Quantifying tourist destination popularity based on granular computing

4.2.1 Tourism destination popularity at different spatial scales

A tourism text dataset organized based on the spatial dimension can support the calculation of TDP at the four spatial scales of scenic spots, tourist routes, scenic areas, and tourist destinations and can be used to explore the coupling relationships between tourist behavior and spatial semantics at different scales. The following is a detailed description of the approaches used to calculate TDP at different spatial scales.

(I) Scenic spot scale. The popularity of each scenic spot is calculated based on its tourist route, which is contributed to jointly by the popularity of A4, B4 and C4. The total popularity of scenic spots is

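$$ID\left(Gr_{line\_j}^{4\&T_r}/Gr_{allspot}^{4\&T_r}\right)=P\left(Gr_{line\_j}^{4\&T_r}\mid Gr_{allspot}^{4\&T_r}\right)=\frac{P\left(Gr_{line\_j}^{4\&T_r}\right)}{P\left(Gr_{allspot}^{4\&T_r}\right)}. \tag{7}$$

Among these, $Gr_{line\_j}^{4\&T_r}=Gr_{line^{e}\_j}^{4\&T_r}\cup Gr_{line^{*}\_j}^{4\&T_r}\cup Gr_{\&\_line\_j}^{4\&T_r}$, and $Gr_{allspot}^{4\&T_r}=\bigcup_{line\in\{A,B,\cdots,X\}}Gr_{line\_j}^{4\&T_r}$, which includes

$$\text{A4}:\ ID\left(Gr_{line^{e}\_j}^{4\&T_r}/Gr_{allspot}^{4\&T_r}\right)=P\left(Gr_{line^{e}\_j}^{4\&T_r}\mid Gr_{allspot}^{4\&T_r}\right)=\frac{P\left(Gr_{line^{e}\_j}^{4\&T_r}\right)}{P\left(Gr_{allspot}^{4\&T_r}\right)},$$

$$\text{B4}:\ ID\left(Gr_{line^{*}\_j}^{4\&T_r}/Gr_{allspot}^{4\&T_r}\right)=P\left(Gr_{line^{*}\_j}^{4\&T_r}\mid Gr_{allspot}^{4\&T_r}\right)=\frac{P\left(Gr_{line^{*}\_j}^{4\&T_r}\right)}{P\left(Gr_{allspot}^{4\&T_r}\right)},$$

$$\text{C4}:\ ID\left(Gr_{\&\_line\_j}^{4\&T_r}/Gr_{allspot}^{4\&T_r}\right)=P\left(Gr_{\&\_line\_j}^{4\&T_r}\mid Gr_{allspot}^{4\&T_r}\right)=\frac{P\left(Gr_{\&\_line\_j}^{4\&T_r}\right)}{P\left(Gr_{allspot}^{4\&T_r}\right)}.$$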

For example, there are 49 scenic spots, and the set of scenic spot granules is a subset of the ‘scenic spots’ at the tourist route scale. Because each B4 or C4 text contains more than one scenic spot, such a text is counted once for every scenic spot that it mentions. Taking ‘Zhenzhutanpubu’ as an example, the number of A4 texts is 449, and the total number of texts describing scenic spots is 10,607 (that is, the sum of A4, B4 and C4 for the 49 scenic spots). Hence, the A4 popularity of Zhenzhutanpubu is 449/10,607 = 4.23%.
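The ratio in Eq (7) can be checked numerically. The following minimal sketch uses the counts reported for Zhenzhutanpubu (A4 from the paragraph above, B4 and C4 from Table 5); the function name `popularity` and the variable names are assumptions of this example.

```python
def popularity(part_count, total_count):
    """Share of one granule within the set of all scenic spot texts, as in Eq (7)."""
    return part_count / total_count


TOTAL_SPOT_TEXTS = 10607  # sum of A4, B4 and C4 over the 49 scenic spots
zhenzhutanpubu = {"A4": 449, "B4": 89, "C4": 91}

shares = {
    part: round(100 * popularity(n, TOTAL_SPOT_TEXTS), 2)
    for part, n in zhenzhutanpubu.items()
}
total_share = round(
    100 * popularity(sum(zhenzhutanpubu.values()), TOTAL_SPOT_TEXTS), 2
)
print(shares, total_share)
```

Under these counts the three parts come to 4.23%, 0.84% and 0.86%, and their sum of 629 texts gives 5.93%, which agrees with the Zhenzhutanpubu row of Table 5.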

Because each text describing B4 and C4 contains multiple scenic spots, the same text is counted repeatedly in multiple scenic spot granules, which leads to values larger than the actual occurrences of scenic spot texts that are contained in $Gr_{allspot}^{4\&T_r}$. Therefore, the distribution characteristics of different parts of scenic spot granules within the scope of the scenic area are considered, and the comprehensive popularity of scenic spots within the scope of a scenic area can be obtained:

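$$\begin{aligned}
P\left(Gr_{line\_j}^{4\&T_r}\right)&=\sum_{k\in\{e,*,\&\}}P\left(Gr_{line^{k}\_j}^{4\&T_r}\mid Gr_{line^{k}}^{3\&T_r}\right)P\left(Gr_{line^{k}}^{3\&T_r}\right)\\
&=P\left(Gr_{line^{e}\_j}^{4\&T_r}\mid Gr_{line^{e}}^{3\&T_r}\right)P\left(Gr_{line^{e}}^{3\&T_r}\right)+P\left(Gr_{line^{*}\_j}^{4\&T_r}\mid Gr_{line^{*}}^{3\&T_r}\right)P\left(Gr_{line^{*}}^{3\&T_r}\right)+P\left(Gr_{\&\_line\_j}^{4\&T_r}\mid Gr_{\&\_line}^{3\&T_r}\right)P\left(Gr_{\&\_line}^{3\&T_r}\right)\\
&=\Big\{\Big[P\left(Gr_{line^{e}\_j}^{4\&T_r}\mid Gr_{line^{e}}^{3\&T_r}\right)P\left(Gr_{line^{e}}^{3\&T_r}\mid Gr_{e}^{3\&T_r}\right)P\left(Gr_{e}^{3\&T_r}\mid Gr_{all^{e}}^{2\&T_r}\right)\\
&\qquad+P\left(Gr_{line^{*}\_j}^{4\&T_r}\mid Gr_{line^{*}}^{3\&T_r}\right)P\left(Gr_{line^{*}}^{3\&T_r}\mid Gr_{*}^{3\&T_r}\right)P\left(Gr_{*}^{3\&T_r}\mid Gr_{all^{e}}^{2\&T_r}\right)\\
&\qquad+P\left(Gr_{\&\_line\_j}^{4\&T_r}\mid Gr_{\&\_line}^{3\&T_r}\right)P\left(Gr_{\&\_line}^{3\&T_r}\mid Gr_{\&}^{3\&T_r}\right)P\left(Gr_{\&}^{3\&T_r}\mid Gr_{all^{e}}^{2\&T_r}\right)\Big]P\left(Gr_{all^{e}}^{2\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)P\left(Gr_{scenic}^{2\&T_r}\right)\Big\}
\end{aligned}\tag{8}$$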

(II) Tourist route scale. The popularity of each tourist route is calculated based on its scenic area, which is jointly contributed to by the popularity of scenic spots (A3, B3 and C3) and tourist routes (D3 and E3). The total popularity of tourist routes is

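$$ID\left(Gr_{line}^{3\&T_r}/Gr_{allline}^{3\&T_r}\right)=P\left(Gr_{line}^{3\&T_r}\mid Gr_{allline}^{3\&T_r}\right)=\frac{P\left(Gr_{line}^{3\&T_r}\right)}{P\left(Gr_{allline}^{3\&T_r}\right)} \tag{9}$$

Among these, $Gr_{line}^{3\&T_r}=Gr_{line^{e}}^{3\&T_r}\cup Gr_{line^{*}}^{3\&T_r}\cup Gr_{\&\_line}^{3\&T_r}\cup Gr_{line^{1}}^{3\&T_r}\cup Gr_{\#\_line}^{3\&T_r}$, and $Gr_{allline}^{3\&T_r}=\bigcup_{line=A}^{X}Gr_{line}^{3\&T_r}$.

  1. The popularity of scenic spots is
    $$ID\left(\Big(\bigcup_{j\in\{line^{e},\,line^{*},\,\&\_line\}}Gr_{j}^{3\&T_r}\Big)/Gr_{allline}^{3\&T_r}\right)=\frac{P\left(\bigcup_{j\in\{line^{e},\,line^{*},\,\&\_line\}}Gr_{j}^{3\&T_r}\right)}{P\left(Gr_{allline}^{3\&T_r}\right)}, \tag{10}$$
    which includes
    $$\text{A3}:\ ID\left(Gr_{line^{e}}^{3\&T_r}/Gr_{allline}^{3\&T_r}\right)=P\left(Gr_{line^{e}}^{3\&T_r}\mid Gr_{allline}^{3\&T_r}\right)=\frac{P\left(Gr_{line^{e}}^{3\&T_r}\right)}{P\left(Gr_{allline}^{3\&T_r}\right)},$$
    $$\text{B3}:\ ID\left(Gr_{line^{*}}^{3\&T_r}/Gr_{allline}^{3\&T_r}\right)=P\left(Gr_{line^{*}}^{3\&T_r}\mid Gr_{allline}^{3\&T_r}\right)=\frac{P\left(Gr_{line^{*}}^{3\&T_r}\right)}{P\left(Gr_{allline}^{3\&T_r}\right)},$$
    and
    $$\text{C3}:\ ID\left(Gr_{\&\_line}^{3\&T_r}/Gr_{allline}^{3\&T_r}\right)=P\left(Gr_{\&\_line}^{3\&T_r}\mid Gr_{allline}^{3\&T_r}\right)=\frac{P\left(Gr_{\&\_line}^{3\&T_r}\right)}{P\left(Gr_{allline}^{3\&T_r}\right)}.$$
  2. The popularity of tourist routes is
    $$ID\left(\Big(\bigcup_{j\in\{line^{1},\,\#\_line\}}Gr_{j}^{3\&T_r}\Big)/Gr_{allline}^{3\&T_r}\right)=\frac{P\left(\bigcup_{j\in\{line^{1},\,\#\_line\}}Gr_{j}^{3\&T_r}\right)}{P\left(Gr_{allline}^{3\&T_r}\right)}, \tag{11}$$
    which includes
    $$\text{D3}:\ ID\left(Gr_{line^{1}}^{3\&T_r}/Gr_{allline}^{3\&T_r}\right)=P\left(Gr_{line^{1}}^{3\&T_r}\mid Gr_{allline}^{3\&T_r}\right)=\frac{P\left(Gr_{line^{1}}^{3\&T_r}\right)}{P\left(Gr_{allline}^{3\&T_r}\right)},$$
    and
    $$\text{E3}:\ ID\left(Gr_{\#\_line}^{3\&T_r}/Gr_{allline}^{3\&T_r}\right)=P\left(Gr_{\#\_line}^{3\&T_r}\mid Gr_{allline}^{3\&T_r}\right)=\frac{P\left(Gr_{\#\_line}^{3\&T_r}\right)}{P\left(Gr_{allline}^{3\&T_r}\right)}.$$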

For example, there are 4 tourist routes, among which the set of scenic spot granules is a subset of the ‘scenic spots’ at the scenic area scale, and the set of tourist route granules is a subset of the ‘tourist routes’ at the scenic area scale. Similarly, the same texts are counted for B3, C3 and E3 using the number of actual scenic spots in the text. Taking ‘Rizegou’ as an example, the number of A3 is 3,549, and the total number of texts describing tourist routes is 8,858 (that is, the total number of A3, B3, C3, D3 and E3 on the 4 tourist routes). Hence, the popularity of A3 of Rizegou is 40.07%.

Some texts may contain multiple scenic spots or multiple tourist routes. We count them repeatedly in multiple tourist route granules, leading to more texts than the actual number of route texts in $Gr_{allline}^{3\&T_r}$. By considering the distribution of different parts of tourist route granules within a scenic area, the comprehensive popularity of the tourist routes within that scenic area can be obtained:

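$$\begin{aligned}
P\left(Gr_{line}^{3\&T_r}\right)&=\sum_{k\in\{e,*,\&,1,\#\}}P\left(Gr_{line^{k}}^{3\&T_r}\mid Gr_{k}^{3\&T_r}\right)P\left(Gr_{k}^{3\&T_r}\right)\\
&=P\left(Gr_{line^{e}}^{3\&T_r}\mid Gr_{e}^{3\&T_r}\right)P\left(Gr_{e}^{3\&T_r}\right)+P\left(Gr_{line^{*}}^{3\&T_r}\mid Gr_{*}^{3\&T_r}\right)P\left(Gr_{*}^{3\&T_r}\right)+P\left(Gr_{\&\_line}^{3\&T_r}\mid Gr_{\&}^{3\&T_r}\right)P\left(Gr_{\&}^{3\&T_r}\right)\\
&\quad+P\left(Gr_{line^{1}}^{3\&T_r}\mid Gr_{1}^{3\&T_r}\right)P\left(Gr_{1}^{3\&T_r}\right)+P\left(Gr_{\#\_line}^{3\&T_r}\mid Gr_{\#}^{3\&T_r}\right)P\left(Gr_{\#}^{3\&T_r}\right)\\
&=\Big\{\Big[P\left(Gr_{line^{e}}^{3\&T_r}\mid Gr_{e}^{3\&T_r}\right)P\left(Gr_{e}^{3\&T_r}\mid Gr_{all^{e}}^{2\&T_r}\right)+P\left(Gr_{line^{*}}^{3\&T_r}\mid Gr_{*}^{3\&T_r}\right)P\left(Gr_{*}^{3\&T_r}\mid Gr_{all^{e}}^{2\&T_r}\right)\\
&\qquad+P\left(Gr_{\&\_line}^{3\&T_r}\mid Gr_{\&}^{3\&T_r}\right)P\left(Gr_{\&}^{3\&T_r}\mid Gr_{all^{e}}^{2\&T_r}\right)\Big]P\left(Gr_{all^{e}}^{2\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)\\
&\quad+\Big[P\left(Gr_{line^{1}}^{3\&T_r}\mid Gr_{1}^{3\&T_r}\right)P\left(Gr_{1}^{3\&T_r}\mid Gr_{all^{\#}}^{2\&T_r}\right)+P\left(Gr_{\#\_line}^{3\&T_r}\mid Gr_{\#}^{3\&T_r}\right)P\left(Gr_{\#}^{3\&T_r}\mid Gr_{all^{\#}}^{2\&T_r}\right)\Big]P\left(Gr_{all^{\#}}^{2\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)\Big\}P\left(Gr_{scenic}^{2\&T_r}\right).
\end{aligned}\tag{12}$$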

The comprehensive popularity of tourist routes and scenic spots reflects their relative popularity on the macro scale—the relative popularity of scenic areas. The horizontal comparison of popularity between different scenic areas or tourist destinations can be achieved using the comprehensive popularity measures.
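To make the chain of conditional probabilities concrete, the sketch below evaluates only the single-spot term of the comprehensive popularity for Rizegou, relative to the scenic area (that is, leaving out the final factor for the scenic area within the destination). It is an illustration under stated assumptions rather than the authors' computation: the counts are taken from Tables 3 and 4, and the variable names are hypothetical.

```python
from fractions import Fraction

a3_rizegou = 3549  # single-spot texts on Rizegou (Table 4)
a_all = 6717       # single-spot texts in the whole scenic area (Table 3)
spots_all = 8055   # all scenic spot texts: 6717 + 739 + 599 (Table 3)
toponym = 34290    # toponym texts of the scenic area

# Route within category, category within all spot texts, spot texts within
# toponym texts: the intermediate counts cancel out of the product.
chain = (
    Fraction(a3_rizegou, a_all)
    * Fraction(a_all, spots_all)
    * Fraction(spots_all, toponym)
)
print(f"{float(chain):.2%}")
```

The product collapses to 3549/34290, or about 10.35%. This telescoping holds cleanly for the single-spot part because those granules are additive; the multi-spot and multi-route parts involve repeated counting and are not covered by this simplified example.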

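(III) Scenic area scale. For toponym text, the tourist destination popularity of each scenic area is contributed to by the popularity of scenic spots ($1S^{2\&T_r}$, $1R\_mS^{2\&T_r}$, and $mR\_mS^{2\&T_r}$; A2, B2 and C2 for short), the popularity of tourist routes ($1R^{2\&T_r}$ and $mR^{2\&T_r}$; D2 and E2 for short), and the popularity of scenic areas.

  1. The popularity of scenic spots is
    $$ID\left(Gr_{all^{e}}^{2\&T_r}/Gr_{scenic}^{2\&T_r}\right)=P\left(Gr_{all^{e}}^{2\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)=\frac{P\left(Gr_{all^{e}}^{2\&T_r}\right)}{P\left(Gr_{scenic}^{2\&T_r}\right)}, \tag{13}$$
    which includes
    $$\text{A2}:\ ID\left(Gr_{e}^{3\&T_r}/Gr_{scenic}^{2\&T_r}\right)=P\left(Gr_{e}^{3\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)=\frac{P\left(Gr_{e}^{3\&T_r}\right)}{P\left(Gr_{scenic}^{2\&T_r}\right)},$$
    $$\text{B2}:\ ID\left(Gr_{*}^{3\&T_r}/Gr_{scenic}^{2\&T_r}\right)=P\left(Gr_{*}^{3\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)=\frac{P\left(Gr_{*}^{3\&T_r}\right)}{P\left(Gr_{scenic}^{2\&T_r}\right)},$$
    and
    $$\text{C2}:\ ID\left(Gr_{\&}^{3\&T_r}/Gr_{scenic}^{2\&T_r}\right)=P\left(Gr_{\&}^{3\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)=\frac{P\left(Gr_{\&}^{3\&T_r}\right)}{P\left(Gr_{scenic}^{2\&T_r}\right)}.$$
  2. The popularity of a tourist route is
    $$ID\left(Gr_{all^{\#}}^{2\&T_r}/Gr_{scenic}^{2\&T_r}\right)=P\left(Gr_{all^{\#}}^{2\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)=\frac{P\left(Gr_{all^{\#}}^{2\&T_r}\right)}{P\left(Gr_{scenic}^{2\&T_r}\right)}, \tag{14}$$
    which includes
    $$\text{D2}:\ ID\left(Gr_{1}^{3\&T_r}/Gr_{scenic}^{2\&T_r}\right)=P\left(Gr_{1}^{3\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)=\frac{P\left(Gr_{1}^{3\&T_r}\right)}{P\left(Gr_{scenic}^{2\&T_r}\right)},$$
    and
    $$\text{E2}:\ ID\left(Gr_{\#}^{3\&T_r}/Gr_{scenic}^{2\&T_r}\right)=P\left(Gr_{\#}^{3\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)=\frac{P\left(Gr_{\#}^{3\&T_r}\right)}{P\left(Gr_{scenic}^{2\&T_r}\right)}.$$
  3. The popularity of a scenic area is
    $$ID\left(Gr_{scenic^{1}}^{2\&T_r}/Gr_{scenic}^{2\&T_r}\right)=P\left(Gr_{scenic^{1}}^{2\&T_r}\mid Gr_{scenic}^{2\&T_r}\right)=\frac{P\left(Gr_{scenic^{1}}^{2\&T_r}\right)}{P\left(Gr_{scenic}^{2\&T_r}\right)}. \tag{15}$$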

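For example, the number of toponym texts is 34,290, and the number of texts describing a scenic spot is 8,055 (that is, the sum of A2, B2 and C2). Therefore, the popularity of scenic spots at the scenic area scale is 8,055/34,290 = 23.49%, which agrees with the sum of the A2, B2 and C2 shares (19.59%, 2.16% and 1.75%) reported in Table 3.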

(IV) Tourist destination scale. The popularity of a tourist destination includes the popularity of each scenic area within that tourist destination.

The proportion of toponym text is

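$$ID\left(Gr_{scenic}^{2\&T_r}/Gr_{SCENIC}^{2\&T_r}\right)=P\left(Gr_{scenic}^{2\&T_r}\mid Gr_{SCENIC}^{2\&T_r}\right)=\frac{P\left(Gr_{scenic}^{2\&T_r}\right)}{P\left(Gr_{SCENIC}^{2\&T_r}\right)}, \tag{16}$$

and the proportion of nontoponym text is

$$ID\left(\neg Gr_{scenic}^{2\&T_r}/Gr_{SCENIC}^{2\&T_r}\right)=P\left(\neg Gr_{scenic}^{2\&T_r}\mid Gr_{SCENIC}^{2\&T_r}\right)=\frac{P\left(\neg Gr_{scenic}^{2\&T_r}\right)}{P\left(Gr_{SCENIC}^{2\&T_r}\right)}. \tag{17}$$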

For example, the total number of texts describing Jiuzhaigou is 36,740, the number of toponym texts is 34,290; thus, the proportion of toponym texts is 93.33%.

4.2.2 Tourist destination popularity at different temporal scales

The tourism text dataset organized based on the temporal dimension can also support TDP calculations at different temporal scales [79]. Combined with the spatial dimension, the temporal dimension is helpful for exploring the temporal features and evolutionary rules of tourist group behaviors.

  1. The year scale is
    ID(¬GrjSr&1_a/GrjSr&1)=P(¬GrjSr&1_a|GrjSr&1)=P(¬GrjSr1_a)/aYearSetP(GrjSr&1_a), (18)
  2. the month scale is
    ID(GrjSr&2_a/b/GrjSr&1_a)=P(GrjSr&2_a/b|GrjSr&1_a)P(GrjSr&1_a|GrjSr&1)=P(GrjSr&2_a/b)bMonthSetP(GrjSr&2_a/b)P(GrjSr&1_a)aYearSetP(GrjSr&1_a), (19)
  3. the day scale is
    ID(GrjSr&3_a/b/c/GrjSr&2_a/b)=P(GrjSr&3_a/b/c|GrjSr&2_a/b)P(GrjSr&2_a/b|GrjSr&1_a)=P(GrjSr&3_a/b/c)cDaySetP(GrjSr&3_a/b/c)P(GrjSr&2_a/b)bMonthSetP(GrjSr&2_a/b) (20)
    ID(GrjSr&3_a/b/c/GrjSr&1_a)=P(GrjSr&3_a/b/c|GrjSr&1_a)P(GrjSr&1_a|GrjSr&1)=bMonthSetP(GrjSr&1_a/b/c)cDaySetbMonthSetP(GrjSr&1_a/b/c)P(GrjSr&1_a)aYearSetP(GrjSr&1_a), (21)
  4. and the hour scale is
    ID(GrjSr&4_a/b/c/d/GrjSr&3_a/b/c)=P(GrjSr&4_a/b/c/d|GrjSr&3_a/b/c)P(GrjSr&3_a/b/c|GrjSr&2_a/b)=P(GrjSr&4_a/b/c/d)dHourSetP(GrjSr&4_a/b/c/d)P(GrjSr&3_a/b/c)cDaySetP(GrjSr&3_a/b/c) (22)
    ID(GrjSr&4_a/b/c/d/GrjSr&2_a/b)=P(GrjSr&4_a/b/c/d|GrjSr&3_a/b/c)P(GrjSr&3_a/b/c|GrjSr&2_a/b)=cDaySetP(GrjSr&4_a/b/c/d)bMonthSetcDaySetP(GrjSr&4_a/b/c/d)P(GrjSr&2_a/b)bMonthSetP(GrjSr&2_a/b) (23)
    ID(GrjSr&4_a/b/c/d/GrjSr&1_a)=P(GrjSr&1_a/b/c/d|GrjSr&1_a/b/c)P(GrjSr&1_a|GrjSr&1)=bMonthSetcDaySetP(GrjSr&4_a/b/c/d)dHourSet(bMonthSetcDaySetP(GrjSr&4_a/b/c/d))P(GrjSr&1_a)aYearSetP(GrjSr&1_a). (24)
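The year-scale and month-scale measures are products of sibling shares, which the following sketch illustrates. The yearly counts are those of Table 1; the monthly counts are toy numbers invented for this example (no monthly counts are quoted here), and the function names `share` and `month_scale_popularity` are assumptions.

```python
YEAR_COUNTS = {2013: 9431, 2014: 6339, 2015: 6179, 2016: 7514, 2017: 7277}  # Table 1


def share(counts, key):
    """Relative frequency of one granule among its sibling granules."""
    return counts[key] / sum(counts.values())


def month_scale_popularity(month_counts, year_counts, year, month):
    """Month share within its year times the year share, following Eq (19)."""
    return share(month_counts[year], month) * share(year_counts, year)


print(f"2013 year-scale share: {share(YEAR_COUNTS, 2013):.2%}")

# Toy counts, invented for illustration only (not taken from the dataset).
toy_years = {2013: 60, 2014: 40}
toy_months = {2013: {6: 15, 10: 45}}
toy_result = month_scale_popularity(toy_months, toy_years, 2013, 10)
print(round(toy_result, 4))
```

With the Table 1 counts, 2013 accounts for 9,431 of 36,740 texts, or about 25.67%. In the toy case the month share 45/60 is multiplied by the year share 60/100, giving 0.45; the day and hour scales extend the same pattern one level further down.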

5 A case study from Jiuzhaigou

5.1 Descriptions of the research area and data

Jiuzhaigou is a famous national scenic area, a nature reserve, and a typical waterscape and landscape scenic area in China [80,81] that spans a large area, features beautiful scenery, and attracts numerous tourists. The Sina microblog site is rich in related scenic data and is highly representative.

Our research group purchased access to the commercial Sina microblog application programming interface (API) and downloaded the microblog data within the spatial range of the scenic area. The microblog data used in this study were collected from 00:00:00 on January 1, 2013 to 00:00:00 on January 1, 2018. Sina microblog provides a data collection method based on a "collection point" and a "range" to cover the spatial range of the research area. The following parameters were used in the experiment:

  • Collection point 1: latitude 33.216981, longitude 103.912572, range: 10000m;

  • Collection point 2: latitude 33.079005, longitude 103.898839, range: 10000m.

Finally, the data within the coverage area are filtered according to the method mentioned in the manuscript.
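The coverage defined by the two collection points can be re-checked locally for any geotagged post. The sketch below is an assumption-laden illustration and not part of the Sina API: it treats each "range" as a great-circle radius of 10,000 m, uses a spherical Earth with a mean radius of 6,371 km, and the function names `haversine_m` and `in_coverage` are hypothetical.

```python
import math

# Collection points (latitude, longitude) and range as listed above.
COLLECTION_POINTS = [(33.216981, 103.912572), (33.079005, 103.898839)]
RANGE_M = 10_000
EARTH_RADIUS_M = 6_371_000  # mean Earth radius; an assumption of this sketch


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_coverage(lat, lon):
    """True if the position lies within RANGE_M of at least one collection point."""
    return any(
        haversine_m(lat, lon, clat, clon) <= RANGE_M
        for clat, clon in COLLECTION_POINTS
    )


gap = haversine_m(*COLLECTION_POINTS[0], *COLLECTION_POINTS[1])
print(round(gap), in_coverage(33.15, 103.90), in_coverage(30.67, 104.06))
```

Under this spherical approximation the two points are roughly 15 km apart, so the two 10 km circles overlap and together form one continuous coverage area.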

In total, we collected 105,226 microblog posts from 2013 to 2017, which constitute all the Sina microblog posts published during this period regarding Jiuzhaigou. After filtering out 68,486 noise entries, we obtained 36,740 valid tourism text entries (see Table 1), which constitute the dataset.

Table 1. Effective microblogs in Jiuzhaigou in 2013–2017.

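| Data | Year | Number of valid texts | Description of main attributes |
| Jiuzhaigou scenic area | 2013 | 9,431 | UID: user ID; |
| Jiuzhaigou scenic area | 2014 | 6,339 | Created_at: release time; |
| Jiuzhaigou scenic area | 2015 | 6,179 | Lat/Long: release position; |
| Jiuzhaigou scenic area | 2016 | 7,514 | Text: text content; |
| Jiuzhaigou scenic area | 2017 | 7,277 | User name: user name. |
| Total | | 36,740 | |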

5.2 Tourist destination popularity mining at multi-spatiotemporal scales

TDPMTGC integrates spatial and temporal scales into one systematic model by using GrC, which makes all the scales in the spatial and temporal dimensions related. For each spatiotemporal scale, not only can the popularity of the spatiotemporal units represented by each data granule be quantitatively described, but the contributions of different units to the overall TDP of this scale can also be compared. The popularity contribution of data granules in the lower layer to those in the upper layer can be further quantitatively analyzed. Due to the length restrictions of this paper, we take only the correlations shown in Fig 3 as examples to conduct multi-spatiotemporal scale TDP mining and to demonstrate the performance of TDPMTGC. The results are shown in Tables 2–5 and Figs 4 and 5.

Fig 3. Correlation diagram of multi-spatiotemporal scale data in Jiuzhaigou.

Table 2. Calculated results of TDP (abbreviated pop.) at Jiuzhaigou tourist destination scales.

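| Jiuzhaigou Scenic Area: toponym text | Jiuzhaigou Scenic Area: proportion | Jiuzhaigou Scenic Area: nontoponym text | Jiuzhaigou Scenic Area: proportion | Scenic Area 2 | Scenic Area … | Scenic Area N | Total text of Jiuzhaigou |
| 34290 | 93.33% | 2450 | 6.67% | omit | omit | omit | 36740 |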

Table 5. Calculated results of TDP (abbreviated pop.) at Jiuzhaigou scenic spot scales.

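| Route | Scenic spot | Type | 1-S (A4) | Pop. | 1-R_m-S (B4) | Pop. | m-R_m-S (C4) | Pop. | Total of scenic spots | Sum of pop. | Comprehensive pop. |
| Shuzhenggou | Heyezhai | village | 19 | 0.18% | 1 | 0.01% | 9 | 0.08% | 29 | 0.27% | 0.08% |
| Shuzhenggou | Yanazhai | village | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Shuzhenggou | Panyazhai | village | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Shuzhenggou | Jianpanzhai | village | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Shuzhenggou | Guwazhai | village | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Shuzhenggou | Penjingtan | beach | 28 | 0.26% | 11 | 0.10% | 21 | 0.20% | 60 | 0.57% | 0.17% |
| Shuzhenggou | Luweihai | lake | 135 | 1.27% | 21 | 0.20% | 64 | 0.60% | 220 | 2.07% | 0.64% |
| Shuzhenggou | Heijiaozhai | village | 0 | 0 | 0 | 0 | 1 | 0.01% | 1 | 0.01% | 0 |
| Shuzhenggou | Zhayizhagashenshan | mountain | 2 | 0.02% | 0 | 0 | 1 | 0.01% | 3 | 0.03% | 0.01% |
| Shuzhenggou | Shuanglonghai | lake | 14 | 0.13% | 15 | 0.14% | 15 | 0.14% | 44 | 0.41% | 0.13% |
| Shuzhenggou | Huohuahai | lake | 119 | 1.12% | 31 | 0.29% | 69 | 0.65% | 219 | 2.06% | 0.64% |
| Shuzhenggou | Huohuahaipubu | waterfall | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Shuzhenggou | Wolonghai | lake | 11 | 0.10% | 11 | 0.10% | 15 | 0.14% | 37 | 0.35% | 0.11% |
| Shuzhenggou | Shuzhengqunhai | lake | 67 | 0.63% | 26 | 0.25% | 38 | 0.36% | 131 | 1.24% | 0.38% |
| Shuzhenggou | Shuzhengpubu | waterfall | 109 | 1.03% | 30 | 0.28% | 65 | 0.61% | 204 | 1.92% | 0.59% |
| Shuzhenggou | Shuzhengzhai | village | 148 | 1.40% | 24 | 0.23% | 41 | 0.39% | 213 | 2.01% | 0.62% |
| Shuzhenggou | Laohuhai | lake | 118 | 1.11% | 47 | 0.44% | 81 | 0.76% | 246 | 2.32% | 0.72% |
| Shuzhenggou | Xiniuhai | lake | 133 | 1.25% | 46 | 0.43% | 119 | 1.12% | 298 | 2.81% | 0.87% |
| Rizegou | Nuorilangqunhai | lake | 0 | 0 | 1 | 0.01% | 0 | 0 | 1 | 0.01% | 0.00% |
| Rizegou | Nuorilangpubu | waterfall | 389 | 3.67% | 52 | 0.49% | 165 | 1.56% | 606 | 5.71% | 1.77% |
| Rizegou | Nuorilangbaohuzhongxin | other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Rizegou | Semonvshenshan | mountain | 0 | 0 | 0 | 0 | 2 | 0.02% | 2 | 0.02% | 0.01% |
| Rizegou | Jinghai | lake | 301 | 2.84% | 85 | 0.80% | 97 | 0.91% | 483 | 4.55% | 1.41% |
| Rizegou | Dagenanshenshan | mountain | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Rizegou | Zhenzhutan | beach | 215 | 2.03% | 85 | 0.80% | 79 | 0.74% | 379 | 3.57% | 1.11% |
| Rizegou | Zhenzhutanpubu | waterfall | 449 | 4.23% | 89 | 0.84% | 91 | 0.86% | 629 | 5.93% | 1.83% |
| Rizegou | Jinlinghai | lake | 2 | 0.02% | 3 | 0.03% | 2 | 0.02% | 7 | 0.07% | 0.02% |
| Rizegou | Kongquehedao | lake | 3 | 0.03% | 10 | 0.09% | 4 | 0.04% | 17 | 0.16% | 0.05% |
| Rizegou | Wuhuahai | lake | 931 | 8.78% | 211 | 1.99% | 246 | 2.32% | 1388 | 13.09% | 4.05% |
| Rizegou | Xiongmaohai | lake | 299 | 2.82% | 160 | 1.51% | 133 | 1.25% | 592 | 5.58% | 1.73% |
| Rizegou | Xiongmaohaipubu | waterfall | 20 | 0.19% | 22 | 0.21% | 5 | 0.05% | 47 | 0.44% | 0.14% |
| Rizegou | Jianzhuhai | lake | 359 | 3.38% | 138 | 1.30% | 113 | 1.07% | 610 | 5.75% | 1.78% |
| Rizegou | Jianzhuhaipubu | waterfall | 95 | 0.90% | 25 | 0.24% | 18 | 0.17% | 138 | 1.30% | 0.40% |
| Rizegou | Rizegoubaohuzhongxin | other | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Rizegou | Tian’ehai | lake | 25 | 0.24% | 25 | 0.24% | 6 | 0.06% | 56 | 0.53% | 0.16% |
| Rizegou | Fangcaohai | lake | 4 | 0.04% | 12 | 0.11% | 3 | 0.03% | 19 | 0.18% | 0.06% |
| Rizegou | Jianyanxuanquan | waterfall | 0 | 0 | 1 | 0.01% | 0 | 0 | 1 | 0.01% | 0 |
| Rizegou | Yuanshisenlin | forest | 457 | 4.31% | 55 | 0.52% | 73 | 0.69% | 585 | 5.52% | 1.71% |
| Rizegou | Zangmalonglihai | lake | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Zechawagou | Zechawazhai | village | 12 | 0.11% | 0 | 0 | 8 | 0.08% | 20 | 0.19% | 0.06% |
| Zechawagou | Xiajijiehai | lake | 3 | 0.03% | 10 | 0.09% | 3 | 0.03% | 16 | 0.15% | 0.05% |
| Zechawagou | Ganzigonggaishan | mountain | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Zechawagou | Shangjijiehai | lake | 6 | 0.06% | 4 | 0.04% | 2 | 0.02% | 12 | 0.11% | 0.03% |
| Zechawagou | Wucaichi | lake | 1456 | 13.73% | 205 | 1.93% | 345 | 3.25% | 2006 | 18.91% | 5.85% |
| Zechawagou | Changhai | lake | 768 | 7.24% | 201 | 1.89% | 288 | 2.72% | 1257 | 11.85% | 3.67% |
| Zharugou | Zharusi | temple | 19 | 0.18% | 2 | 0.02% | 4 | 0.04% | 25 | 0.24% | 0.07% |
| Zharugou | Baojingyan | mountain | 1 | 0.01% | 2 | 0.02% | 3 | 0.03% | 6 | 0.06% | 0.02% |
| Zharugou | Rexizhai | village | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Zharugou | Guoduzhai | village | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |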

Fig 4. The variation tendencies of the popularity of Jiuzhaigou at various temporal scales.

(a1) scenic areas at the monthly scale L1&2. (a2) scenic areas at the hourly scale L1&4. (a3) scenic areas for 2013 at the daily scale L1&3_2013/b/c. (a4) scenic areas for 2017 at the daily scale L1&3_2017/b/c. (b1) tourist routes at the monthly scale L1&2. (b2) tourist routes at the hourly scale L1&4. (c1) scenic spots at the monthly scale L1&2. (c2) scenic spots at the hourly scale L1&4. (d1) popularity based on the numbers of microblog posts in August 2017. (d2) popularity based on the numbers of microblog posts in Rizegou.

Fig 5. Comprehensive popularity of scenic spots represented by text data granules at the scenic spot scale.

5.2.1 The spatiotemporal model associated with the scenic area

General characteristics: at the scenic area scale, TDP mainly comprises the popularity of the Jiuzhaigou scenic area itself and that of each tourist route (see Table 3). Texts that refer to Jiuzhaigou as a whole account for approximately 76% of the toponym texts, while those that refer to individual routes or scenic spots account for only approximately 24%. Among the latter, more than 90% reference a single tourist route (Single-Spot, 1-Route-N-spots and Single-Route). Overall, most descriptions of Jiuzhaigou describe the entire scenic area.

Table 3. Calculated results of TDP (abbreviated pop.) at Jiuzhaigou scenic area scales.
| Scenic spots: 1-S (A2) | Pop. | Scenic spots: 1-R_m-S (B2) | Pop. | Scenic spots: m-R_m-S (C2) | Pop. | Tourist routes: 1-R (D2) | Pop. | Tourist routes: m-R (E2) | Pop. | Scenic area | Pop. | Number of toponym texts |
| 6717 | 19.59% | 739 | 2.16% | 599 | 1.75% | 72 | 0.21% | 15 | 0.04% | 26148 | 76.26% | 34290 |

Correlation 1–1 (scenic area, year-month): the monthly patterns are basically the same across 2014, 2015 and 2016. The peak season lasted from June to October: the popularity began to increase significantly in June, reached a small peak in August, decreased slightly in September, and then reached the annual peak in October. The off-season lasted from November to May of the following year: the popularity decreased significantly in November and reached its lowest trough in December and January before steadily recovering from February to May. These observations are consistent with the conclusions reached by Wang [5] and Yan [82]. However, significant abnormal trends occurred in 2013 and 2017. The two typical peaks in June and October of 2013 were much higher than those in other years. Popularity also increased suddenly in August 2017 and then decreased sharply after reaching the peak. These anomalies are related to policy or tourism events and can be further interpreted and analyzed based on the distribution patterns at the daily scale.

Correlation 1–2 (scenic area, month-day): At the daily scale, we selected several months in 2013 and 2017 for which to analyze the daily popularity changes. The variations in daily popularity in June, August and October 2013 are illustrated in Fig 4(a3). The daily variation in August fluctuates randomly. The abnormal popularity in June 2013 occurred mainly from the 10th to 13th, which overlapped with the Dragon Boat Festival holiday, i.e., the second holiday to which the free expressway policy implemented in October 2012 applied, thus intensifying tourists’ desire to travel and leading to a sudden increase in TDP. The abnormal popularity in October lasted mainly from the 2nd to 6th, coinciding with the National Day holiday. The popularity reached its highest value on October 2, corresponding to the large-scale tourist detention event on that day. The daily variation in the August 2017 anomaly had its highest peak from August 8–11, as shown in Fig 4(a4), which coincided with the period of the 7.0-magnitude earthquake in Jiuzhaigou County on August 8. These results are consistent with the conclusions of Cao [83], which indicated that the disaster event was the main factor leading to the increase in popularity in Jiuzhaigou during this period.

Correlation 1–3 (scenic area, month-hour): The popularity continuously increases from 6:00 to 22:00, with a slightly fluctuating pattern. In addition, the daily variation in hot months is significantly different from those in other months. During the peak popularity months from June to October, the daily variation shows a pattern with three peaks and two valleys: a peak at approximately 12:00, a higher peak at 15:00–16:00, the highest peak at 21:00–22:00, a valley at 13:00–14:00 and the lowest trough at 18:00–19:00. This pattern is consistent with the characteristics of sightseeing tours, dining, and rest times in summer and autumn. During the off-season months from November to May, the daily variation shows a pattern of two peaks and one valley (peaks at 16:00–17:00 and 21:00–22:00 and a valley at 18:00–19:00), or an insignificant peak and valley. These patterns are associated with the characteristics of sightseeing tours, dining, and rest times in winter and spring.

5.2.2 The spatiotemporal model associated with tourist routes

Correlation 2–1 (tourist route-scenic area): The popularity rankings of the four tourist routes from high to low are Rizegou, Zechawagou, Shuzhenggou, and Zharugou. Rizegou contains the most scenic spots, and the popularity of this tourist route contributes >50% of the popularity of all tourist routes (see Table 4). With 6 representative landscapes, Zechawagou reached a popularity of 33%, ranking second among the four routes. Although Shuzhenggou contains many scenic spots, its popularity is lower than those of Rizegou and Zechawagou. Few tourists pay attention to Zharugou, resulting in a very low popularity value.

Table 4. Calculated results of TDP (abbreviated pop.) at Jiuzhaigou tourist route scales.
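| Tourist route | Scenic spots: 1-S (A3) | Pop. | Scenic spots: 1-R_m-S (B3) | Pop. | Scenic spots: m-R_m-S (C3) | Pop. | Tourist routes: 1-R (D3) | Pop. | Tourist routes: m-R (E3) | Pop. | Total of tourist routes | Sum of pop. | Comprehensive pop. |
| Shuzhenggou | 903 | 10.19% | 109 | 1.23% | 293 | 3.31% | 24 | 0.27% | 9 | 0.10% | 1338 | 15.10% | 3.90% |
| Rizegou | 3549 | 40.07% | 419 | 4.73% | 533 | 6.02% | 35 | 0.40% | 13 | 0.15% | 4549 | 51.35% | 13.27% |
| Zechawagou | 2245 | 25.34% | 209 | 2.36% | 464 | 5.24% | 13 | 0.15% | 11 | 0.12% | 2942 | 33.21% | 8.58% |
| Zharugou | 20 | 0.23% | 2 | 0.02% | 7 | 0.08% | 0 | 0 | 0 | 0 | 29 | 0.33% | 0.08% |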
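
The two summary columns of Table 4 can be reproduced from the route totals. The sketch below is a numerical observation on the reported figures, not a statement of how the authors computed them: "Sum of pop." matches each route total divided by the 8,858 route-scale texts, and "Comprehensive pop." matches the same total divided by the 34,290 toponym texts. The assignment of rows to routes is inferred from the surrounding text (Rizegou above 50%, Zechawagou at 33%), and the variable names are hypothetical.

```python
TOPONYM_TEXTS = 34290
ROUTE_TOTALS = {
    "Shuzhenggou": 1338,
    "Rizegou": 4549,
    "Zechawagou": 2942,
    "Zharugou": 29,
}
all_route_texts = sum(ROUTE_TOTALS.values())

summary = {}
for route, n in ROUTE_TOTALS.items():
    sum_pop = round(100 * n / all_route_texts, 2)    # share among route-scale texts
    comprehensive = round(100 * n / TOPONYM_TEXTS, 2)  # share among toponym texts
    summary[route] = (sum_pop, comprehensive)
    print(f"{route}: {sum_pop:.2f}% / {comprehensive:.2f}%")
```

The comprehensive values are smaller because they are expressed relative to all toponym texts of the scenic area, which is what makes them comparable across scenic areas.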

Correlation 2–2 (tourist route—scenic spot): The popularity of each tourist route is mainly contributed to by the scenic spots on each route, whose contribution rate is close to 100%, among which the contribution rates of single scenic spots are all >67%. Microblog users who mention tourist route spatial units rarely describe the names of tourism routes but they do directly describe specific scenic spots, and most describe single scenic spots.

Correlation 2–3 (tourist route-scenic spot, year-month): The popularity variation tendencies of tourist routes at the monthly scale within a year were basically consistent with those of the scenic area scale. The high popularity period lasted from June to October. The high popularity of Rizegou shows an obvious pattern of three peaks and two valleys, which is consistent with the high popularity of scenic area in June and October, indicating the contribution of Rizegou's popularity to its scenic area. There is no inter-monthly fluctuation based on the popularity anomalies in each route, which means no popularity anomaly was caused by inter-monthly or seasonal landscapes with higher popularity.

Correlation 2–3 (tourist route-scenic spot): There were significant differences in the within-day variation tendencies of the three main routes. ① The popularity of Rizegou is always the highest, while that of Shuzhenggou is always the lowest, indicating that visitors pay different amounts of attention to these routes. ② The hours at which the popularity of Rizegou, Zechawagou and Shuzhenggou began to increase significantly were 8:00, 9:00 and 15:00, respectively; the peak hours were 10:00–15:00, 11:00–15:00 and 15:00–17:00, respectively; the peaks occurred at 12:00, 14:00 and 15:00, respectively; and the troughs appeared at 20:00, 19:00 and 20:00, respectively. These characteristics indicate that route popularity is clearly affected by tourism guides: most tourists start by entering Rizegou and Zechawagou and only later visit Shuzhenggou. The popularity variation tendency of each route can be further analyzed using the scenic spot scale model.

5.2.3 The spatiotemporal model associated with scenic spots

At the scenic spot scale, the top-15 scenic spots account for approximately 90% of the popularity of all the scenic spots; therefore, we selected only the top-15 scenic spots for this discussion.
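The share of the top-15 scenic spots can be checked against the totals reported in Table 5. The following sketch simply sums the fifteen largest "Total of scenic spots" values and divides by the 10,607 scenic spot texts; the variable names are assumptions of this example.

```python
# Fifteen largest "Total of scenic spots" values from Table 5.
SPOT_TOTALS = {
    "Wucaichi": 2006, "Wuhuahai": 1388, "Changhai": 1257,
    "Zhenzhutanpubu": 629, "Jianzhuhai": 610, "Nuorilangpubu": 606,
    "Xiongmaohai": 592, "Yuanshisenlin": 585, "Jinghai": 483,
    "Zhenzhutan": 379, "Xiniuhai": 298, "Laohuhai": 246,
    "Luweihai": 220, "Huohuahai": 219, "Shuzhengzhai": 213,
}
ALL_SPOT_TEXTS = 10607  # sum of A4, B4 and C4 over all 49 scenic spots

top15_total = sum(SPOT_TOTALS.values())
top15_share = top15_total / ALL_SPOT_TEXTS
print(top15_total, f"{top15_share:.2%}")
```

The fifteen spots together account for 9,731 of the 10,607 counted mentions, or about 91.7%, in line with the approximate figure of 90% stated above.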

Correlation 3–1 (scenic spot-tourist route): the popularity of scenic spots differs by type and by tourist route (see Table 5 and Fig 5). Three scenic spots in the first level have the highest popularity: Wucaichi and Changhai in Zechawagou and Wuhuahai in Rizegou (in bold underlined font). The scenic spots in the second level with high popularity are concentrated in Rizegou, including the 7 scenic spots of Zhenzhutanpubu, Jianzhuhai, Nuorilangpubu, Xiongmaohai, Yuanshisenlin, Jinghai and Zhenzhutan (in bold font). The scenic spots in Shuzhenggou are ranked only at the third level and include Luweihai, Huohuahai, Shuzhengzhai, Laohuhai and Xiniuhai (underlined font). In general, most of the popular scenic spots are water-related landscapes, such as lakes and waterfalls; these account for approximately 1/3 of the scenic spots, while the other scenic spots have relatively lower popularity, among which folk customs and cultural landscapes (such as villages) have the lowest popularity. These results are highly consistent with the conclusion of Tang [40] that ‘most tourists consider the natural landscape in Jiuzhaigou, while the Tibetan villages and other cultural landscapes with important folk culture and historical values are not recognized enough’. At the same time, TDPMTGC quantitatively analyzes the contribution of granules in the lower layer to the TDP of the upper layers, improving the understanding of tourist behaviors.

The type characteristics of scenic spot popularity can explain the differences in route popularity from another perspective. The lakes of Rizegou account for approximately half of the route, and the 4 hot lakes effectively improve the popularity of the route. There are fewer lakes in Zechawagou, but the first-level-popularity landscapes Wucaichi and Changhai support the high route popularity. There are more scenic spots in Shuzhenggou, but half of them are villages, which have low popularity, leading to the low route popularity.

Correlation 3–2 (scenic spot-tourist route, year-month): the variation tendency of scenic spot popularity at the monthly scale shows three modes: single peak, double peak and triple peak. ① Single peak: The single peaks of Wuhuahai, Zhenzhutanpubu, Xiongmaohai and Nuorilangpubu in Rizegou appear in June, August and October, while the peak of Huohuahai in Shuzhenggou appears in August. ② Double peak: the double peak of Rizegou appears in June and October and includes Yuanshisenlin, Jinghai and Zhenzhutan. The double peak of Zechawagou, namely Wucaichi and Changhai, appears in June, August and October. The double peak of Shuzhenggou appears in June and October and includes scenic spots such as Shuzhengzhai, Laohuhai and Xiniuhai. ③ Triple peak: The triple peak appears only for Jianzhuhai in Rizegou, also in June, August and October. By analyzing the proportions of the peak months of each scenic spot in the annual popularity of the scenic spot and their contribution to the annual popularity of tourist routes, the characteristics of the popularity of scenic spots on tourist routes can be obtained. For example, Zechawagou has two of the highest-popularity scenic spots: Wucaichi (double peak in August and October) and Changhai (double peak in June and October). The peak months account for <23.89%, 20.68%>, <20.99%, 22.79%> of the annual popularity of the scenic spots, while their contributions to the annual popularity of the tourist route are <10.51%, 9.1%>, <4.87%, 5.29%>, respectively. Therefore, the tourist route peaks in October and the peak season lasts from June to October. In summary, the peaks of high-popularity scenic spots all appear in June, August and October, and the total numbers of peaks are 10, 4 and 10, respectively. Each monthly peak accounts for 20%–33% of the annual popularity of its scenic spot (Huohuahai reaches approximately 66%); the contributions of the first-level popular scenic spots to the popularity of tourist routes range from 4.87% to 10.51%, and the contributions of the second-level and third-level popular scenic spots to the popularity of tourist routes are generally less than 3%. The popularity of a tourist route is affected by its scenic spots, and the appearance and duration of its peaks are consistent with those of its scenic spots. Combined with the monthly variation tendency of scenic spots, the tourist route and scenic area popularity modes are relatively consistent.
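
The relationship between the two sets of percentages quoted for Wucaichi and Changhai can be checked with simple arithmetic. The sketch below assumes (this is an interpretation, not a formula stated in this section) that a peak month's contribution to the route equals its share of the scenic spot's annual popularity multiplied by the spot's share of the route's annual popularity. The figures are those quoted in the text; the variable and function names are hypothetical.

```python
# Figures quoted in the text, in percent, listed in the order the text gives them.
reported = {
    "Wucaichi": {"peak_share_of_spot": [23.89, 20.68], "contribution_to_route": [10.51, 9.10]},
    "Changhai": {"peak_share_of_spot": [20.99, 22.79], "contribution_to_route": [4.87, 5.29]},
}


def implied_spot_share_of_route(peak_share, contribution):
    """Assumed relation: contribution = peak_share * spot_share_of_route."""
    return contribution / peak_share


implied = {
    spot: [
        round(implied_spot_share_of_route(p, c), 3)
        for p, c in zip(values["peak_share_of_spot"], values["contribution_to_route"])
    ]
    for spot, values in reported.items()
}
print(implied)
```

Under this assumption, both peak months of a scenic spot imply the same ratio (about 0.44 for Wucaichi and about 0.23 for Changhai), which suggests that the two sets of percentages are mutually consistent.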

Correlation 3–3 (scenic spot-tourist route, day-hour): The initial period, peak type and popularity level of the scenic spots on a tourist route have a significant influence on the popularity mode of that route. For example, the popularity of scenic spots in Rizegou begins to rise at 7:00–9:00, and their peaks occur over a wide span of hours. The peak of Wuhuahai at 12:00 makes a large contribution to its tourist route. The peak values in Yuanshisenlin and Jianzhuhai reach 0.82% at 9:00 and last for 2 hours. The peak values of Xiongmaohai, Wuhuahai and Jinghai are 0.71%, 1.64% and 0.42%, respectively, from 11:00–13:00, lasting for 1–3 hours. The peak values of Zhenzhutanpubu and Nuorilangpubu are 0.008 at 15:00, lasting for 1–3 hours. After accumulation, the popularity of Rizegou rises at 7:00, its peak appears at 12:00, and its high popularity lasts from 11:00–15:00 with a popularity value ranging from 3.83%–4.39%. Similarly, through the superposition of the popularity of all scenic spots on their tourist route, the popularity of Zechawagou increases at 9:00 and reaches a double peak with a popularity value between 2.51% and 2.97% from 11:00–15:00, while the popularity of Shuzhenggou rises slightly and then falls at 8:00. Its popularity rises again significantly at 14:00, a peak appears at 15:00, and this peak lasts from 15:00–16:00 with a popularity value of 1.24%–1.4%.
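
The accumulation step described above can be sketched as a per-hour sum over the scenic spots of each route. This is an illustrative sketch only: the function names, the spot-to-route mapping and the hourly values are hypothetical, and simple summation is assumed as the superposition rule.

```python
from collections import defaultdict


def route_hourly_popularity(spot_hourly, spot_to_route):
    """Sum the hourly popularity of each scenic spot into its tourist route."""
    routes = defaultdict(lambda: defaultdict(float))
    for spot, series in spot_hourly.items():
        route = spot_to_route[spot]
        for hour, value in series.items():
            routes[route][hour] += value
    return {route: dict(hours) for route, hours in routes.items()}


def peak_hour(series):
    """Return the hour with the highest popularity value."""
    return max(series, key=series.get)


# Hypothetical hourly popularity values (percent), for illustration only.
spot_hourly = {
    "spot_a": {9: 0.8, 12: 0.3},
    "spot_b": {9: 0.1, 12: 1.6},
}
spot_to_route = {"spot_a": "route_1", "spot_b": "route_1"}

route_series = route_hourly_popularity(spot_hourly, spot_to_route)
route_peak = peak_hour(route_series["route_1"])
print(route_series, route_peak)
```

In this toy example the route peaks at the hour dominated by its most popular spot, mirroring how a single high-popularity spot can set the peak of a whole route.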

5.2.4 The relationship between popularity variation tendency and numbers of microblog posts

The temporal variation of the TDP is calculated from the microblog posts published during the same time period. The TDP trend and the trend in the number of posts are similar but not identical, and there are three main differences.

  1. Source data and reorganized data. The temporal variations in TDP as calculated by TDPMTGC are based on the text dataset after data reorganization rather than on the source microblog posts for the research area during the same period. Taking 2017 as an example, the source data contain 20,764 posts focused on Jiuzhaigou, and this number was reduced to 7,277 after data reorganization. A comparison of the daily variation patterns within months (see Fig 4(d1)) showed that the overall trends were consistent and both were affected by the earthquake in Jiuzhaigou on August 8. However, the source data contain texts that are unrelated to the research area; thus, the variations are not exactly the same.

  2. Intersections between data granules. Intersections occur between data granules at some spatial scales, and the intersecting parts of the text belong to multiple granules. When calculating the comprehensive popularity of data granules, the intersecting parts of the text must be counted in each of the granules they belong to, which expands the text count relative to the source data; the resulting variations therefore differ slightly from the trends in the number of microblog posts. For example, the granule types at the tourist route scale are single spot, one route with multiple spots, multiple routes with multiple spots, single route and multiple routes; of these, the 'multiple routes with multiple spots' and 'multiple routes' granules belong to several route granules simultaneously. Therefore, the absolute count at the tourist route scale is slightly different from the overall number of microblog posts (see Fig 4(d2)).

  3. The popularity value of the same data granule can differ between scales. For example, the popularity of Wucaichi at the scenic spot scale is 18.91% as calculated based on scenic spots, while its comprehensive popularity in the scenic area is 5.85% (see Table 5). Moreover, due to the different inclusion relationships of data granules at different scales, there is not necessarily a proportional relationship between the popularity values (e.g., 'multiple routes with multiple spots' granules belong to multiple tourist route granules at the tourist route scale but to only one granule at the scenic area scale and thus are calculated differently at different scales).
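
The second and third differences can be illustrated with a small counting sketch. It assumes that each post has already been labeled with the set of routes it mentions (the posts below are hypothetical) and that popularity is a simple share of counts, which is a simplification of the inclusion-degree calculation used by TDPMTGC.

```python
from collections import Counter


def route_granule_counts(posts):
    """Count each post once in every route granule it mentions."""
    counts = Counter()
    for routes in posts:
        counts.update(routes)
    return counts


# Hypothetical posts, each reduced to the set of routes it mentions.
# Empty sets stand for posts that only mention the scenic area as a whole.
posts = [
    {"Rizegou"},
    {"Rizegou", "Zechawagou"},
    {"Shuzhenggou"},
    set(),
    set(),
]

counts = route_granule_counts(posts)
memberships = sum(counts.values())                    # granule memberships at the route scale
posts_with_route = sum(1 for routes in posts if routes)

# The same granule measured against two different denominators.
route_scale_share = counts["Rizegou"] / memberships   # share within the route scale
area_scale_share = counts["Rizegou"] / len(posts)     # share within the whole scenic area
print(memberships, posts_with_route, route_scale_share, area_scale_share)
```

Because one post mentions two routes, the route scale holds four memberships for only three route-related posts, and the share of the same route changes with the denominator chosen for each scale.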

5.2.5 Summary

TDPMTGC allows for the comparability of the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers and then achieves the comparison of popularity values of data granules between adjacent scales and across scales. Detailed and quantitative descriptions of TDP at multi-spatiotemporal scales are helpful for comprehensively and deeply exploring the spatiotemporal characteristics of tourism from the viewpoint of tourists’ cognition.

6 Discussion

Guided by the data granulation approach, TDPMTGC normalizes and recombines the spatiotemporal information expressed explicitly or implicitly in unstructured tourism texts into a systematic framework that integrates the spatial and temporal dimensions, ensuring that its popularity calculations are measurable and comparable.

Previous TDP research approaches have important implications for this paper. We take the approaches of Hu [41], Wang [5] and Tang [40] as examples and compare them with the TDPMTGC proposed in this paper, both to acknowledge what TDPMTGC inherits from the existing approaches and to highlight its further innovations, its potential advantages and value in future applications, and the research questions that follow.

  1. Dataset. Before the advent of the big data era, questionnaires represented the main method of obtaining user data (e.g., Tang et al [40]). However, the rapid development of the Internet has caused the data scale to explode. Increasingly, scholars focus on mining social media data, such as Flickr photos and microblog data (e.g., Hu et al [41] and Wang et al [5]). TDPMTGC uses the full content of tourism UGC texts, which contain rich spatiotemporal and semantic information that is conducive to in-depth explorations of the rules governing tourists’ spatiotemporal behaviors and analysis of the driving mechanisms of tourism spatial patterns and processes. This approach better reflects users’ real emotional trends than does data collected based on specific research objectives, such as questionnaire surveys and interviews, and it reduces the differences caused by sparse or inconsistent samples. For example, analyzing the variation tendency of popularity of Jiuzhaigou at the daily scale, we find that the unusual period of attention by tourists is associated with holidays, special policies, tourism events and sudden disasters. The feature extraction of tourism UGC text from an abnormal time period can be used to analyze users’ emotional trends. One advantage of TDPMTGC is that the data types it can use are unrestricted. Although we chose text for this study, other types of data could also be employed, and we plan to conduct further research using Flickr photos.

  2. Methodology. Hu et al [41] designed a three-layer framework to extract areas of interest (AOIs) from geotagged photos to understand the spatiotemporal dynamics of these areas. Tang et al [40] constructed a model of tourists’ sense of place and studied their perceptions and evaluations of tourist destinations from four dimensions: natural scenery, social cultural setting, tourism function, and affectional attachment. Wang et al [5] used the kernel density estimation (KDE) algorithm to analyze tourists’ attention to the landscape at multi-spatiotemporal scales. Most of the existing methods have regarded a tourist destination as an integral spatial unit for studying evolutionary rules at multi-temporal scales. While others consider multi-spatiotemporal scales, there is no correlation concerning the values between scales, which affects the accuracy of these approaches. Inspired by the existing methods, TDPMTGC fully considers the spatiotemporal scale characteristics of big data. Tourism text data granules are used to represent landscape objects in tourism geography, the multi-spatiotemporal scales in tourism GIScience are depicted by the multi-hierarchical structure of GrC, and the spatial and temporal dimensions are integrated into a systematic framework as attributes of the data granules. In this way, quantitative calculations of multi-spatiotemporal scales and popularity deduction between adjacent scales and across scales can be achieved. The potential advantages and values of this approach will be reflected by the following aspects in future applications.
    • ① TDPMTGC has good semantic scalability. UGC data are granularized and reorganized based on spatiotemporal scales to form text data granules with clear spatiotemporal semantics. Moreover, the granulation criteria can be extended to geography or to other thematic semantics, such as tourism emotion, sightseeing, consumption behaviors and service perceptions. Thus, this approach can not only quantitatively calculate tourist spatial popularity but can also be combined with other methods for studying tourist spatiotemporal behaviors, landscape preferences, and spatial images. TDPMTGC has a wide range of applications and can be used to support different research goals in tourism, geography or other fields of humanities and social sciences.
    • ② TDPMTGC has good adaptability to spatiotemporal scales and types of tourist destinations. In terms of scale design, the granulation criteria of each layer are independent. The data in the upper scale are mapped to the data in lower scales through granulation criteria between each layer. Making changes in the granular layers and scale requires changing only the granulation criteria between the affected adjacent granular layers, which will not affect other granular layers. Therefore, the number of spatiotemporal scales can be adjusted dynamically based on the scale and development characteristics of tourist destinations when using TDPMTGC. For example, some tourist destinations, such as ancient cities, have no tourist routes; thus, the spatial scales could be simplified and the tourist route layer could be deleted. TDPMTGC is applicable to tourist destinations with different types and themes, for example, nature and humanity, which can facilitate comparative studies involving different types of tourist destinations.
    • ③ TDPMTGC can be adapted to dynamic changes in the data. The granular structure of tourism text data supports the expansion of dynamic incremental data in a specific granular layer without affecting other layers. TDPMTGC can dynamically calculate TDP corresponding to the varying granular layers and achieve real-time monitoring of TDP at multi-spatiotemporal scales.
  3. Experimental results. By comparing the AOI growth model, Hu et al [41] found that AOIs in developed cities have large initial areas but slow development speeds, while AOIs in rapidly developing cities have low initial values but significant growth rates. Tang et al [40] found that the natural landscape of Jiuzhaigou has received high perception evaluation scores and presents good general recognition by tourists, while the perception evaluation scores of its social and cultural environment are relatively low. Wang et al [5] discovered popular routes and scenic spots in Jiuzhaigou by mining the spatial pattern and evolutionary processes of tourists’ attention at multi-spatiotemporal scales. TDPMTGC not only obtained conclusions consistent with these previous results but also revealed detailed features of TDP that were not described in previous studies because it allows a quantitative analysis of the driving forces of tourism phenomena. These results suggest that TDPMTGC has better precision and quantitative and cross-scale calculation and deduction abilities compared with previous approaches.
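
The layer independence and incremental updating described under ② and ③ above can be sketched as follows. This is a simplified illustration, not the TDPMTGC implementation: keyword matching stands in for the granulation criteria, each layer is given its own criterion rather than a mapping between adjacent layers, and the class, function and layer names as well as the example posts are hypothetical.

```python
from collections import Counter


def keyword_criterion(lexicon):
    """Build a criterion mapping a text to the granule labels whose keywords occur in it."""
    def criterion(text):
        return {label for label, keywords in lexicon.items()
                if any(keyword in text for keyword in keywords)}
    return criterion


class GranularLayers:
    """Layers keyed by name; each layer owns its criterion and its granule counts."""

    def __init__(self):
        self.criteria = {}
        self.counts = {}

    def add_layer(self, name, criterion):
        self.criteria[name] = criterion
        self.counts[name] = Counter()

    def remove_layer(self, name):
        # Dropping one layer leaves the criteria and counts of the others untouched.
        del self.criteria[name]
        del self.counts[name]

    def add_text(self, text):
        # Incremental update: a new text only increments the granules it matches.
        for name, criterion in self.criteria.items():
            self.counts[name].update(criterion(text))


layers = GranularLayers()
layers.add_layer("scenic_area", keyword_criterion({"Jiuzhaigou": ["Jiuzhaigou"]}))
layers.add_layer("tourist_route", keyword_criterion({"Rizegou": ["Rizegou"]}))
layers.add_layer("scenic_spot", keyword_criterion({"Wuhuahai": ["Wuhuahai"]}))

layers.add_text("Wuhuahai in Jiuzhaigou is stunning")  # hypothetical post
layers.remove_layer("tourist_route")                    # e.g. a destination without routes
layers.add_text("Back in Jiuzhaigou")                   # hypothetical post
print(layers.counts)
```

Keeping one criterion and one counter per layer means that removing the tourist route layer, or adding new texts later, does not require recomputing the other layers.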

7 Conclusions and future work

In this paper, we introduce the idea of GrC into tourism GIScience, allowing quantitative calculations of TDP to be conducted based on unstructured tourism UGC text. We propose the granular structure of multi-spatiotemporal tourism text data, design a GrC model of tourism text based on inclusion degree, and implement a text mining approach to calculate TDP based on GrC. The main contributions of TDPMTGC include the following: (1) A regularized data recombination based on granular structure is achieved for unstructured tourism UGC. This recombination includes both implicit spatial semantics and explicit temporal semantics, which can improve tourism GIScience research based on unstructured text data mining. (2) We can describe both the TDP at a single spatial or temporal scale and the patterns and processes of TDP at multi-spatiotemporal scales using data granular layers corresponding to the spatiotemporal scales. (3) The inclusion degree based on conditional probability can be used to describe spatial popularity and standardizes basic spatiotemporal units at different spatiotemporal scales, which can be used to quantify the contribution degrees of different spatial and temporal units to TDP.
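
As a minimal sketch of contribution (3), an inclusion degree of the conditional-probability type can be written as the fraction of a reference granule's texts that also belong to another granule. The form |X ∩ Y| / |X| used below is one common definition and is assumed here for illustration; the exact definition used by TDPMTGC is given in the Methods and is not reproduced in this section. The function name and the text identifiers are hypothetical.

```python
def inclusion_degree(granule, reference):
    """Return |granule ∩ reference| / |reference| for two sets of text identifiers."""
    if not reference:
        raise ValueError("reference granule must not be empty")
    return len(granule & reference) / len(reference)


# Hypothetical text identifiers for three nested granules.
area = set(range(1, 11))   # texts about the scenic area
route = {2, 3, 5, 7, 8}    # texts about one tourist route
spot = {2, 3, 5}           # texts about one scenic spot on that route

spot_in_route = inclusion_degree(spot, route)
spot_in_area = inclusion_degree(spot, area)
print(spot_in_route, spot_in_area)
```

The same scenic spot obtains a different value depending on the reference granule, which is how a single measure can express popularity at several spatial scales.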

The results of the presented case study of Jiuzhaigou are consistent with previous results [5,40], confirming the feasibility and effectiveness of TDPMTGC. The main conclusions are as follows. (1) From the perspective of landscape preference, tourists pay more attention to the natural landscape of Jiuzhaigou, especially its water-related landscapes. (2) Based on the spatial characteristics of TDP, different tourist routes have different popularities: Rizegou and Zechawagou have higher popularity; Shuzhenggou has the lowest popularity; and Wucaichi, Huohuahai and Nuorilangpubu are representative landscapes with high popularity. (3) According to the temporal pattern of TDP, there are monthly differences: the peak season, with high popularity, lasts from June to October, while the off-season runs from November to May of the next year. The daily variations in the popularity of attractions present three patterns in different seasons: three peaks and two valleys, two peaks and one valley, or no significant peaks and valleys. In addition, abnormal popularity peaks at yearly, monthly and daily scales were also identified in our results.

The case study of Jiuzhaigou reveals detailed features of TDP that have not been described in previous studies, which supports quantitative analysis of the driving forces of tourism phenomena. (1) TDP at a finer spatial scale can explain the contributions of that scale to overall popularity at the macro scale. For example, Wucaichi and Changhai, which reach the highest popularity level, make a decisive contribution to the high popularity of Zechawagou. This approach also accurately locates periods with abnormal popularity, such as the Dragon Boat Festival, National Day Golden Week, and the Jiuzhaigou earthquake, from the daily tourism patterns in June and October 2013 and August 2017, providing a quantitative explanation for these abnormal phenomena. (2) A comprehensive analysis of multi-spatiotemporal scales is implemented, revealing new features of tourists’ spatial cognition. This analysis shows that most microblog users’ descriptions of Jiuzhaigou exist at the Jiuzhaigou scenic area scale, while fewer than one-quarter of users clearly describe specific tourist routes or scenic spots. The users who describe small-scale spaces typically focus on a scenic spot, which reflects a significant weakening of the tourist route scale in tourists’ cognition. The comprehensive analysis at multi-temporal scales shows that the variation curves of the peak- and off-seasons are obviously different: the analysis shows a pattern of three peaks and two valleys in the peak season, while the off-season presents double peaks with a single valley or unremarkable peaks and valleys.

This paper focuses on the initial design of TDPMTGC, which constitutes an exploration of the methodology of tourism GIScience. The following issues need further research: 1) an automatic classification algorithm needs to be designed for performing data granulation; 2) the spatial position and scale features need to be accurately measured and the related parameters of TDPMTGC need to be adjusted and optimized; 3) the results of Jiuzhaigou need to be calculated from more perspectives, and the spatiotemporal behavior characteristics of tourists need to be analyzed at a more detailed spatiotemporal scale; and 4) more cases of different types should be tested for comparison purposes, in which the spatial patterns and evolutionary rules of tourism should be identified with respect to scenic areas, tourist destinations and larger regional scales, providing quantitative data and calculation results to support the analysis of the driving mechanisms of tourism spatial evolution.

Supporting information

S1 File. Supporting document for the use of dataset.

(PDF)

Data Availability

Data were purchased from Beijing Weimengkechuang network technology co. LTD (北京微梦科创网络技术有限公司), which owns the commercial Sina microblog. The authors confirm that interested researchers can replicate their study findings in their entirety by directly obtaining the data from the third-party and following the protocol in our Methods section. Other researchers would be able to access the data set in the same manner as the authors, and the authors did not have any special access privileges that others would not have. The authors provide the following information about Sina microblog: located at Sina headquarters building, building 8, west district, no.10 Xibeiwang East Road, Haidian district, Beijing (北京市海淀区西北旺东路10号院西区8号楼新浪总部大厦); URL: https://open.weibo.com/wiki/C/2/place/nearby_timeline/biz.

Funding Statement

Funded by: 1. the National Natural Science Foundation of China, grant number 41471127 (LRJ, Li Renjie); 2. the Hebei Outstanding Youth Science Fund Cultivation Project, grant number D2015205208 (LRJ, Li Renjie); 3. the 2018 Hebei Province Doctoral Postgraduate Innovation Funding Project, grant number CXZZBS2018108 (CYX, Chi Yunxian).

References

  • 1. Goodchild MF. Geographic information science. International Journal of Geographical Information Systems. 1992; 6(1): 31–45.
  • 2. Lin ZH, Ma YF, Liu XF and Gao N. Spatial and temporal features of network attention of scenic areas. Resources Science. 2012; 34(12): 2427–2433.
  • 3. Zhou XL and Li ZT. On tourists’ online destination information search contents based on Baidu index—a case study of Xi’an. Xinjiang Finance and Economics. 2016; 4: 72–80.
  • 4. Lv T, Li JY, Dai L, Wang M and Yang M. The influence of eWOM on rural tourism behavioral intentions as illustrated by Xi’an urban residents. Tourism Tribune. 2018; 33(2): 48–56.
  • 5. Wang SC, Guo FH, Fu XQ and Li RJ. A study of the spatial patterns of tourist sightseeing based on volunteered geographic information: the case of the Jiuzhai valley. Tourism Tribune. 2014; 29(2): 84–92.
  • 6. Liu DJ, Hu J, Cheng SW, Chen JZ and Zhang Q. Spatial pattern and influencing factors of tourism micro-blogs in China: a case of tourism Sina micro-blogs. Scientia Geographica Sinica. 2015; 35(6): 717–724.
  • 7. Ju SL, Tao ZM and Hang YL. Coupling coordination degree between rural scenic tourist network attention and gravity in Nanjing city. Economic Geography. 2017; 37(11): 220–228.
  • 8. Tieskens KF, VanZanten BT, Schulp CJE and Verburg PH. Aesthetic appreciation of the cultural landscape through social media: an analysis of revealed preference in the Dutch river landscape. Landscape and Urban Planning. 2018; 177: 128–137.
  • 9. Yang XZ, Sun JD, Lu L and Wang Q. Spatial characteristics and social effects of residential spaces in the tourist destination Qiandaohu. Acta Geographica Sinica. 2018; 73(2): 276–294.
  • 10. Hu Z, Zheng WW, Liu PL and Liu XY. The forms and structures of traditional landscape genome maps: a case study of Hunan Province. Acta Geographica Sinica. 2018; 73(2): 317–332.
  • 11. Yang J, Ge YT, Xi JC, Ge QS and Li XM. Spatial-temporal island tourismification effects differentiation of Changhai county. Acta Geographica Sinica. 2016; 71(6): 1075–1087.
  • 12. Qi HL, Liu JS and Mei L. Progress of tourism area life cycle theory. Scientia Geographica Sinica. 2018; 38(2): 264–271.
  • 13. Zhang JZ and Sun GN. Life cycle and upgrade of Shanxi’s mansion as a tourist destination: taking Qiao’s gram compound as an example. Geographical Research. 2012; 31(11): 2104–2114.
  • 14. Guo AX, Guo YZ, Li HJ and Sun XF. Relationship between perceived tourism impacts and perceived quality of life of community residents in tourist destinations. World Regional Studies. 2017; 26(5): 115–127.
  • 15. Lu S and Wu X. Assessment of tourist satisfaction of the painting tourism in the ancient villages: the case study of Hongcun village, Yixian county. Geographical Research. 2017; 36(8): 1570–1582.
  • 16. Kim S, Jeong S, Woo I, Jang Y, Maciejewski R and Ebert DS. Data flow analysis and visualization for spatiotemporal statistical data without trajectory information. IEEE Transactions on Visualization and Computer Graphics. 2018; 24(3): 1287–1300. doi: 10.1109/TVCG.2017.2666146
  • 17. Reitsamer BF, Brunner-Sperdin A and Stokburger-Sauer NE. Destination attractiveness and destination attachment: The mediating role of tourists’ attitude. Tourism Management Perspectives. 2016; 19: 93–101.
  • 18. Wang X, Li X, Zhen F and Zhang JH. How smart is your tourist attraction?: Measuring tourist preferences of smart tourism attractions via a FCEM-AHP and IPA approach. Tourism Management. 2016; 54: 309–320.
  • 19. Stylidis D, Shani A and Belhassen Y. Testing an integrated destination image model across residents and tourists. Tourism Management. 2017; 58: 184–195.
  • 20. Zhu H, Liu JM, Tao H and Zhang J. Evaluation and spatial analysis of tourism resources attraction in Beijing based on the Internet Information. Journal of Natural Resources. 2015; 30(12): 2081–2094.
  • 21. Li JJ, Xu LZ, Tang L, Wang SY and Li L. Big data in tourism research: A literature review. Tourism Management. 2018; 68: 301–323.
  • 22. Pantano E, Priporas CV and Stylos N. ‘You will like it!’ using open data to predict tourists’ response to a tourist attraction. Tourism Management. 2017; 60: 430–438.
  • 23. Zheng XY, Luo YL, Sun LP, Zhang J and Chen FL. A tourism destination recommender system using users’ sentiment and temporal dynamics. Journal of Intelligent Information Systems. 2018; 51: 557–578.
  • 24. Chen SL, Tao HY, Li XL and Zhuo L. Discovering urban functional regions using latent semantic information: Spatiotemporal data mining of floating cars GPS data of Guangzhou. Acta Geographica Sinica. 2016; 71(3): 471–483.
  • 25. Zhou CH. The value of spatial data in the age of big data—evaluation of 《The theory and application of spatial data mining》. Acta Geographica Sinica. 2016; 71(7): 1281.
  • 26. Zhou SH, Hao XH and Liu L. Validation of spatial decay law caused by urban commercial center’s mutual attraction in polycentric city: spatio-temporal data mining of floating cars’ GPS data in Shenzhen. Acta Geographica Sinica. 2014; 69(12): 1810–1820.
  • 27. Zheng ZJ, Du SH, Wang YC and Wang Q. Mining the regularity of landscape-structure heterogeneity to improve urban land-cover mapping. Remote Sensing of Environment. 2018; 214: 14–32.
  • 28. Cai YL, Chen YG, Yan WM, Liu WD and Qi QW. Geography: scientific status and social function. Beijing: Science press; 2017.
  • 29. Fu BJ. Geography: From knowledge, science to decision making support. Acta Geographica Sinica. 2017; 72(11): 1923–1932.
  • 30. Zadeh LA. Fuzzy sets and information granularity. Advances in Fuzzy Set Theory and Applications. 1979; 3–18.
  • 31. Liu Q, Sun H and Wang HF. The present studying state of granular computing and studying of granular computing based on the semantics of rough logic. Chinese Journal of Computers. 2008; 31(4): 543–555.
  • 32. Xu WH, Mi JS and Wu WZ. Granular computing methods and applications based on inclusion degree. Beijing: Science Press; 2017.
  • 33. Xu J, Wang GY and Yu H. Review of big data processing based on granular computing. Chinese Journal of Computers. 2015; 38(8): 1497–1517.
  • 34. Ferrante M, Magno GLL and Cantis SD. Measuring tourism seasonality across European countries. Tourism Management. 2018; 68: 220–235.
  • 35. Silva FBe, Herrera MAM, Rosina K, Barranco RR, Freire S and Schiavina M. Analysing spatiotemporal patterns of tourism in Europe at high-resolution with conventional and big data sources. Tourism Management. 2018; 68: 101–115.
  • 36. Harvey JM and Han JW. Geographic data mining and knowledge discovery. London: CRC Press; 2009.
  • 37. Neumaier S, Savenkov V and Polleres A. Geo-semantic labelling of open data. Procedia Computer Science. 2018; 137: 9–20.
  • 38. Chen N, Peng X and Huang Z. Popularity analysis of tourist attraction based on geotagged social media big data. Science of Surveying and Mapping. 2016; 41(12): 167–171+216.
  • 39. Su SL, Wan C, Hu YX and Cai ZL. Characterizing geographical preferences of international tourists and the local influential factors in China using geo-tagged photos on social media. Applied Geography. 2016; 73: 26–37.
  • 40. Tang WY, Zhang J, Luo H, Yang XZ and Li DH. The characteristics of natural scenery sightseers’ sense of place: a case study of Jiuzhaigou, Sichuan. Acta Geographica Sinica. 2007; 62(6): 599–608.
  • 41. Hu YJ, Gao S, Janowicz K, Yu BL, Li WW and Prasad S. Extracting and understanding urban areas of interest using geotagged photos. Computers, Environment and Urban Systems. 2015; 54: 240–254.
  • 42. Tan JY, Dong LX, Gao J, Guo WW and Li ZX. The methods of extracting spatiotemporal characteristics of travel based on mobile phone data. In: 2018 IEEE 7th Data Driven Control and Learning Systems Conference (DDCLS). Enshi: IEEE; 2018.
  • 43. Li RP, Crowe J, Leifer D, Zou L and Schoof J. Beyond big data: Social media challenges and opportunities for understanding social perception of energy. Energy Research & Social Science. 2019; 56: 101217.
  • 44. Steiger E, Resch B and Zipf A. Exploration of spatiotemporal and semantic clusters of Twitter data using unsupervised neural networks. International Journal of Geographical Information Science. 2016; 30(9): 1694–1716.
  • 45. Cui KJ, Jiang YY, Li Y and Pfoser D. A vocabulary recommendation method for spatiotemporal data discovery based on Bayesian network and ontologies. Big Earth Data. 2019; 3(3): 220–231.
  • 46. Sun K, Zhu YQ, Pan P, Hou ZW, Wang DX, Li WR, et al. Geospatial data ontology: the semantic foundation of geospatial data integration and sharing. Big Earth Data. 2019; 3(3): 269–296.
  • 47. Pei T, Liu YX, Guo SH, Shu H, Du YY, Ma T, et al. Principle of big geodata mining. Acta Geographica Sinica. 2019; 74(3): 586–598.
  • 48. Fayyad U, Piatetsky-Shapiro G and Smyth P. The KDD process for extracting useful knowledge from volumes of data. Communications of the ACM. 1996; 39(11): 27–34.
  • 49. Benz UC, Hofmann P, Willhauck G, Lingenfelder I and Heynen M. Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. ISPRS Journal of Photogrammetry and Remote Sensing. 2004; 58(3/4): 239–258.
  • 50. Manyika J. Big data: The next frontier for innovation, competition, and productivity. Analytics, 2011.
  • 51. Taubenböck H, Wiesner M, Felbier A, Marconcini M, Esch T and Dech S. New dimensions of urban landscapes: The spatio-temporal evolution from a polynuclei area to a mega-region based on remote sensing data. Applied Geography. 2014; 47: 137–153.
  • 52. Mahmoud SH and Gan TY. Irrigation water management in arid regions of Middle East: Assessing spatio-temporal variation of actual evapotranspiration through remote sensing techniques and meteorological data. Agricultural Water Management. 2019; 212: 35–47.
  • 53. Liu Q, Ding C and Chen P. A panel analysis of the effect of the urban environment on the spatiotemporal pattern of taxi demand. Travel Behaviour and Society. 2020; 18: 29–36.
  • 54. Cao N, Lin CG, Zhu QH, Lin YR, Teng X and Wen XD. Voila: visual anomaly detection and monitoring with streaming spatiotemporal data. IEEE Transactions on Visualization and Computer Graphics. 2018; 24(1): 23–33. doi: 10.1109/TVCG.2017.2744419
  • 55. Matthews Y, Scarpa R and Marsh D. Cumulative attraction and spatial dependence in a destination choice model for beach recreation. Tourism Management. 2018; 66: 318–328.
  • 56. Bentaleb A, Bouzekri YE, Lahcen AA and Boulmalf M. Context Aware Recommender Systems for Tourism: A Concise Review. In: IEEE 5th International Congress on Information Science and Technology. Marrakech: IEEE; 2018.
  • 57. Pedrycz W. Granular computing for data analytics: a manifesto of human-centric computing. IEEE/CAA Journal of Automatica Sinica. 2018; 5(6): 1025–1034.
  • 58. Zhou KC, Tian ZS and Yang YW. Periodic pattern detection algorithms for personal trajectory data based on spatiotemporal multi-granularity. IEEE Access. 2019; 7: 99683–99693.
  • 59. Cabrerizo FJ, Morente-Molinera JA, Alonso S, Pedrycz W and Herrera-Viedma E. Improving consensus in group decision making with intuitionistic reciprocal preference relations: A granular computing approach. In: 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC). Miyazaki: IEEE; 2018.
  • 60. Lu W, Shan D, Pedrycz W, Zhang LY, Yang JH and Liu XD. Granular fuzzy modeling for multidimensional numeric data: A layered approach Based on Hyperbox. IEEE Transactions on Fuzzy Systems. 2019; 27(4): 775–789.
  • 61. Yu H, Sun ZY, Wang GY, Li J, Xie YF and Guo G. A multi-granular hierarchical evaluation model for multiple criteria three sorting. In: 2018 IEEE International Conference on Data Mining Workshops (ICDMW). Singapore: IEEE; 2018.
  • 62. Al-Hmouz R, Pedrycz W, Balamash AS and Morfeq A. Hierarchical system modeling. IEEE Transactions on Fuzzy Systems. 2018; 26(1): 258–269.
  • 63. Longley PA and Adnan M. Geo-temporal Twitter demographics. International Journal of Geographical Information Science. 2016; 30(2): 369–389.
  • 64. Peuquet DJ, Robinson AC, Stehle S, Hardisty FA and Luo W. A method for discovery and analysis of temporal patterns in complex event data. International Journal of Geographical Information Science. 2015; 29(9): 1588–1611.
  • 65. Jiang JC, Li QQ, Tu W, Shaw SL and Yue Y. A simple and direct method to analyse the influences of sampling fractions on modelling intra-city human mobility. International Journal of Geographical Information Science. 2019; 33(3): 618–644.
  • 66.Kwan MP and Neutens T. Space-time research in GIScience. International Journal of Geographical Information Science. 2014; 28(5): 851–854. [Google Scholar]
  • 67.Huang W and Li S. An approach for understanding human activity patterns with the motivations behind. International Journal of Geographical Information Science. 2019; 33(2): 385–407. [Google Scholar]
  • 68.Kang JY and Aldstadt J. Using multiple scale spatio-temporal patterns for validating spatially explicit agent-based models. International Journal of Geographical Information Science. 2019; 33(1): 193–213. 10.1080/13658816.2018.1535121 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Zadeh LA. Some reflections on soft computing, granular computing and their roles in the conception, design and utilization of information/intelligent systems. Soft Computing. 1998; 2: 23–25. [Google Scholar]
  • 70.Xiao SL. Review and prospect of spatiotemporal geographic information system and science In: The frontiers of geographic information science. Beijing: Higher Publishing House Education; 2017. [Google Scholar]
  • 71.Hu QH, Zhang LJ, Zhou YC and Pedrycz W. Large-scale multimodality attribute reduction with multi-kernel fuzzy rough sets. IEEE Transactions on Fuzzy Systems. 2018; 26(1): 226–238. [Google Scholar]
  • 72.Xie J, Chen ZH, Xie G and Lin TY. Knowledge mining in big data—a lesson from algebraic geometry. In: 2013 IEEE International Conference on Granular Computing (GrC). Beijing: IEEE; 2013.
  • 73.Gong MG, Li H, Zhang X, Zhao QN and Wang B. Nonparametric statistical active contour based on inclusion degree of fuzzy sets. IEEE Transactions on Fuzzy Systems. 2016; 24(5): 1176–1192. [Google Scholar]
  • 74.Ganter B and Wille R. Formal Concept Analysis: Mathematical Foundations. New York: Springer-Verlag; 1999. [Google Scholar]
  • 75.Chen LS, Wang JY and Li L. The models of granular system and algebraic quotient space in granular computing. Chinese Journal of Electronics. 2016; 25(6): 1109–1113. [Google Scholar]
  • 76.Zhu XB, Pedrycz W and Li ZW. Granular encoders and decoders: a study in processing information granules. IEEE Transactions on Fuzzy Systems. 2017; 25(5): 1115–1126. [Google Scholar]
  • 77.Hobbs JR. Granularity. In: International Joint Conference on Artificial Intelligence (IJCAI). Los Angeles: Morgan Kaufmann. 1985; 432–435.
  • 78.Bai H, Li D, Ge Y and Wang JF. A spatial heterogeneity-based rough set extension for spatial data. International Journal of Geographical Information Science. 2019; 33(2): 240–268. [Google Scholar]
  • 79.Sadahiro Y. Analysis of the appearance and disappearance of point objects over time. International Journal of Geographical Information Science. 2019; 33(2): 215–239. [Google Scholar]
  • 80.Du SY, Guo CX and Jin MZ. Agent-based simulation on tourists’ congestion control during peak travel period using Logit model. Chaos, Solitons and Fractals. 2016; 89: 187–194. [Google Scholar]
  • 81.Xu FF and Fox D. Modelling attitudes to nature, tourism and sustainable development in national parks: a survey of visitors in China and the UK. Tourism Management. 2014; 45: 142–158. [Google Scholar]
  • 82.Yan L, Xu XG and Zhang XP. Analysis to temporal characteristics of tourist flows on Jiuzhaigou world natural heritage. Acta Scientiarum Naturalium Universitatis Pekinensis. 2009; 45(1): 171–177. [Google Scholar]
  • 83.Cao YB and Mao ZJ. Analysis of the spatial and temporal characteristics of disaster information about the Jiuzhaigou, Sichuan MS 7.0 earthquake based on data mining of Sina Weibo. Earthquake Research in China. 2017; 33(4): 613–625. [Google Scholar]

Decision Letter 0

Song Gao

17 Sep 2019

PONE-D-19-21928

Measuring multispatiotemporal scale tourism attraction based on text granular computing

PLOS ONE

Dear Dr. Renjie,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The three expert reviewers provided constructive comments and suggestions to further improve the research. Please address them in a major revision, especially thinking about the following two major points.

(1) Contributions: e.g., what unique information can we gain by applying TAMTGC to texts, which cannot be obtained via other possibly simpler approaches.

(2) The authors did not integrate spatial and temporal scales into one systematic model; this paper is rather a multi-spatial-scale and multi-temporal-scale approach than a multi-spatiotemporal approach.

We would appreciate receiving your revised manuscript by Oct 31 2019 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as a separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as a separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as a separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Song Gao, Ph.D.

Academic Editor

PLOS ONE

Journal requirements:

When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

1. In your manuscript you note, "Our research group purchased the Sina microblog commercial interface and downloaded the microblog data within the spatial range of the scenic area." You have also provided data in the supporting information files. Please confirm in your response to reviewers that you have the necessary permissions to share the data. If the data are owned by a third party and you do not have the necessary permissions to publish the data in the Supporting Information files, in the Data Availability Statement, please (1) explain the data sharing restrictions and (2) provide sufficient information for other researchers to obtain the data in the same way you did. Also, in the Methods, ensure that you have provided sufficient details for others to be able to find the same data and replicate the analyses.

2. Please remove your figures from within your manuscript file, leaving only the individual TIFF/EPS image files, uploaded separately.  These will be automatically included in the reviewers’ PDF.

3. Please ensure that you refer to Figure 3 in your text as, if accepted, production will need this reference to link the reader to the figure.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: This paper introduces granular computing model into tourism geography to better describe tourism attractions through multispatiotemporal scales. However, I have a few questions that I hope the authors can re-consider:

The multispatiotemporal scales are not well designed or explained. The authors propose four spatial scales: scenic spot scale, tourist route scale, scenic area scale, and tourist destination scale. However, there is no justification of why and how these four levels are chosen. Similarly, the temporal scales include year, month, day, and hour scales. The authors also did not justify the choice of these temporal scales.

In Section 3.1, the spatial dimension is constructed from smaller scale to larger scale, but the temporal scale division is conducted from larger scale to smaller scale. The authors need to justify the reason.

The authors mentioned that existing approaches were treating "tourist destination as a whole unit", so this research is to provide an advanced approach by dividing the tourist destination into smaller scales. However, this research proposes a multi-spatiotemporal approach. The authors need to provide reasons why temporal scale is also included in the introduction section.

In calculating the attraction values, this research uses count of a specific scenic spot/route/etc. from the user generated text. However, the authors did not justify why 'count' is sufficient to calculate the attraction value. The authors should discuss the possibilities of other attributes as well. In addition, the uncertainties of analyzing user generated text should be discussed to justify that 'count' or other chosen attribute is positively correlated to attraction of the scenic spot/route/etc. This is because mentioning a specific scenic spot/route/etc. may not necessarily indicate their attractiveness, but may be of other reasons, e.g. accident, negative experience, hours of operation, etc.

One of my biggest concerns of this research is that the authors did not integrate spatial and temporal scales into one systematic model. As demonstrated in approach section and case study section, the approaches are dealing with the spatial scale and then the temporal scale separately. To me, this research is not proposing a multi-spatiotemporal approach, but rather a multi-spatial-scale and multi-temporal-scale approach. I would be intrigued to see spatial and temporal scales truly integrated in one granular computing model, which will make a great contribution to GIScience.

Reviewer #2: This paper proposes to use a granular computing model TAMTGC for analyzing tourism attraction based on user generated content (UGC). The authors described the methodological details of TAMTGC, and conducted an experiment based on Jiuzhaigou using Sina microblog posts.

Strong points:

- The research topic of applying granular computing to analyzing tourism attraction is interesting.

- The authors provided detailed descriptions on the methodology.

Weak point:

- The added value of TAMTGC is unclear. In other words, what unique information can we gain by applying TAMTGC to texts, which cannot be obtained via other possibly simpler approaches? For example, in lines 51-52, the authors wrote: "Tourism attraction can be expressed using the number of visitors [2-4], the index related to online search and evaluation, and the User Generated Content (UGC) published by tourists [5-7]." So what unique and additional information can we gain using TAMTGC compared with e.g., using simply the number of visitors or the number of social media posts? To address this, the authors may need to do two things. First, the authors may need to enrich the introduction section to clarify the unique information obtained by TAMTGC. Second, the authors may need to add some comparisons in their case study of Jiuzhaigou to show the additional information that can be obtained by TAMTGC.

Other more detailed issues:

- Lines 82-85: The authors may consider also discussing the following related paper on analyzing UGC for discovering interesting zones.

Hu, Y., Gao, S., Janowicz, K., Yu, B., Li, W., & Prasad, S. (2015): Extracting and understanding urban areas of interest using geotagged photos, Computers, Environment and Urban Systems, 54, 240-254.

- Line 468: "In total, we collected >100,000" It would be better to use the exact number of posts here.

- Is the dataset used in the case study all Sina microblog posts published during this period in the study area or only a sample? Please clarify.

- Table 5 has too much information and is overwhelming. Maybe the authors can highlight some values with bold font.

- Figure 4: Would the temporal variation of the attraction be similar or different from the numbers of microblog posts in the same time period? The authors may need to provide a comparison and discussion here.

- Lines 582-583: "TAMTGC can use the full volume of the tourism UGC texts, which can better reflect users' real emotional trends" ? Could the authors provide some explanation on "emotional trends" and how TAMTGC can help discover these emotional trends?

Reviewer #3: This paper presents a new method to calculate tourism attraction from textual social media data and develops a new methodology for analysis at multi-spatiotemporal scales. In general, the paper makes a good contribution to spatiotemporal data mining and semantic knowledge discovery. The authors claim several contributions.

The comments are as follows:

1. It would be better to add a new section, “Literature Review” or “Existing Work”, summarizing previous research on semantic knowledge discovery methods in GIScience, spatiotemporal data mining methods, tourism attraction analysis, granular computing models, etc., and then to reorganize the Introduction accordingly.

2. The paper claims 5 aspects of contributions in introduction section. In my opinion, some contributions are not significant enough. For example, the 4th item “TAMTGC is extensible” cannot be thought of as a contribution. And Item 1 and 2 can be combined to illustrate the contribution of TAMTGC. Item 5 should be modified to claim the TAMTGC model was successfully applied in Jiuzhaigou area to obtain some new insightful research conclusion of tourist attractions in this area.

3. In Section 3, some formulas are very long and hard to read. In particular, some formulas are inserted into sentences, which confuses readers. For example, “the total number of A, B and C of 49 scenic spots”, and “the attraction of A of Zhenzhutanpubu is XXXXX”. One suggestion is that the authors replace some formulas with simple symbols (letters such as A, B, C, or simple words) and use these symbols in sentences where complex formulas would otherwise have to appear.

4. In Section 4.2, the result of spatial scale is described using table including different place names as rows. It would be better to use maps to obtain better result visualization effects. Especially, most readers are not familiar with where Jiuzhaigou is, and where the locations of different tourist spots are. So a map of Jiuzhaigou describing locations of different travel spots could be helpful.

5. From Table 2-5, it could be found that most calculation results are VERY small between 0.0000 and 0.0100. Can the authors consider some data normalization method, to normalize the intermediate data and final results to a value between 0.0 and 1.0, or a tourist attraction score between 0.0 and 100.0?

6. Some wording and grammatical errors can be found. There is a logic error in the FIRST sentence of this paper. It should be “tourism GIScience mainly studies a series of basic problems in XXXXX …” During my review of this paper, more than 10 grammatical errors were found, including tense inconsistency and preposition errors. In addition, “multi-spatiotemporal” should be used instead of “multispatiotemporal”. When using “multi” with other nouns, there should always be a “-” between them.
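
The normalization proposed in comment 5 can be illustrated with a minimal sketch. It assumes simple min-max scaling to a 0–100 score, which is only one of several possible normalization methods; the function name `min_max_scale` and the spot labels and raw values are hypothetical and are not taken from the manuscript's tables.

```python
def min_max_scale(values, upper=100.0):
    """Linearly rescale values so the minimum maps to 0 and the maximum to `upper`."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) * upper for v in values]


# Hypothetical raw popularity values in the 0.0000-0.0100 range noted in comment 5.
raw_scores = {"spot_a": 0.0004, "spot_b": 0.0031, "spot_c": 0.0100}
scaled = dict(zip(raw_scores, min_max_scale(list(raw_scores.values()))))
```

Min-max scaling preserves the ranking of granules within one layer, but scores from different layers are comparable only if the same minimum and maximum are used for all of them.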

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Apr 9;15(4):e0228175. doi: 10.1371/journal.pone.0228175.r002

Author response to Decision Letter 0


20 Nov 2019

Responses to reviewers concerning the revision of manuscript [PONE-D-19-21928] (Measuring multi-spatiotemporal scale tourist destination popularity based on text granular computing)

Dear editors and reviewers,

Thank you very much for your comments and suggestions, which have helped us improve the presentation of our research. After carefully reading and reconsidering the comments, we found your questions very helpful and your suggestions pertinent. In response, we have made substantial modifications to the manuscript that we believe strengthen our research. We also thank all the reviewers and editors for their recognition of our manuscript. Our specific modifications and explanations are listed below for each reviewer and each comment:

Academic editor:

Comment 1:

At this time, please confirm in your response to reviewers that you have the necessary permissions to share the data. If the data are owned by a third party and you do not have the necessary permissions to publish the data in the Supporting Information files, please (1) explain the data sharing restrictions, (2) clarify whether or not you had special access privileges to the data that others did not have, and (3) provide us with information for a non-author point of contact whom interested researchers may contact regarding data access.

Revision:

(1) The data sharing restrictions.

We have signed an agreement with a third party. In view of the particularity of commercial data, the data in this paper are only used for scientific research exchange.

(2) Please clarify whether or not you had special access privileges to the data that others did not have.

We had special access privileges to the data that others did not have. Because the commercial data are not public, we obtained the data used in this paper by purchasing access to the Sina microblog commercial interface. During the purchase period, the full set of microblog data for the research area could be obtained, whereas only a small amount of public data can be downloaded without a purchase.

(3) Please also provide us with information for a non-author point of contact whom interested researchers may contact regarding data access.

The data we purchased have been downloaded and saved on our hard drive and are accessible at any time, and we have submitted the dataset to PLOS ONE. We have asked the data administrator to agree that the data in this paper can be shared for the purpose of scientific research exchange. PLOS ONE has access to our minimal dataset and can serve as a non-author point of contact for these queries. Therefore, interested researchers can contact our corresponding authors or PLOS ONE for access to the data. Researchers who obtain the data are asked not to redistribute them widely.

The data administrator also has access to our minimal dataset and can serve as a non-author point of contact for these queries. Moreover, if researchers would like additional data, they can contact the data administrator to make a purchase:

Data administrator: Youzhi Liu

Telephone: 008610-60619366

Email: youzhi@staff.weibo.com

Please note that because microblog data are dynamic and users may delete posts, the data downloaded in different periods may vary in detail but should not differ overall.

Comment 2:

Please do not include funding sources in the Acknowledgments or anywhere else in the manuscript file. Funding information should only be entered in the financial disclosure section of the submission system.

Revision:

We apologize for adding extra information in the manuscript. We have deleted the funding sources in the Acknowledgments.

Thank you very much for your patient guidance and help. If there are any mistakes that need to be corrected, please let us know and we will correct them carefully and promptly.

Comment 3:

Before we can proceed with your paper, please address the following queries:

a) Please confirm that the data you submitted is your 'minimal data set', which PLOS defines as consisting of the data set used to reach the conclusions drawn in the manuscript with related metadata and methods, and any additional data required to replicate the reported study findings in their entirety. This includes:

1) The values behind the means, standard deviations and other measures reported;

2) The values used to build graphs;

3) The points extracted from images for analysis.

b) Please confirm that you have permission to publish these data under a CC BY 4.0 license.

Revision:

a) The data we submitted is our 'minimal data set', which PLOS defines as consisting of the data set used to reach the conclusions drawn in the manuscript with related metadata and methods, and any additional data required to replicate the reported study findings in their entirety. This includes:

1) The values behind the means, standard deviations and other measures reported;

2) The values used to build graphs;

3) The points extracted from images for analysis.

b) We have permission to publish these data under a CC BY 4.0 license.

Reviewer #1:

Comment 1:

The multi-spatiotemporal scales are not well designed or explained. The authors propose four spatial scales: scenic spot scale, tourist route scale, scenic area scale, and tourist destination scale. However, there is no justification of why and how these four levels are chosen. Similarly, the temporal scales include year, month, day, and hour scales. The authors also did not justify the choice of these temporal scales.

Revision:

Thank you for your constructive comments. Multi-spatiotemporal scales that are not well designed or explained would directly affect the reader's understanding of the framework of this paper, so your comments have played an important role in improving our manuscript.

To better explain the multi-spatiotemporal scales of the TDPMTGC method, we have revised the contents of Section 3.2.2 “Granular structure of tourism text data” and extended the space and time scales to multiple scales rather than a fixed four-level scale. The multi-spatiotemporal scale granular structure of tourism text data is represented by the complete graph shown in Fig 1(a), in which layers of the multi-spatial granular structure correspond to the scales. The data granules in the upper scale are transformed into those in the lower scale using the granulation criteria . The data granules decrease as the scale decreases. Similarly, layers of the multi-temporal granular structure correspond to the scales, and granules in the upper scale are transformed into those in the lower scale using the granulation criteria . A complete graph represents the existence of an edge (i.e., a correlation) between any spatial-spatial, temporal-temporal, or spatial-temporal scales. There are edges among the spatial-spatial scales, edges among the temporal-temporal scales, and edges among the spatial-temporal scales; thus, the total number of edges is . The correlation between temporal scales is presupposed by the "spatial-temporal" correlation (i.e., the correlation between two temporal scales ‘ — ’ for a spatial scale is obtained by granulating in layers and , which yields the correlations ‘ — ’ and ‘ — ’). The granular structure of tourism text data can be used not only to mine features of small-scale landscapes (where represents a tourist destination) over a short period (such as when represents an annual scale) but also to mine the life cycle evolutionary laws at large scales (where represents a national or even a global scale) over long periods (such as when represents several centuries (if the data are available)).
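
As a worked count of the edges described above, suppose the structure has $m$ spatial scales and $n$ temporal scales. The symbols $m$ and $n$ are introduced here only for illustration and may not match the manuscript's notation. A complete graph on $m+n$ scale vertices then decomposes as:

```latex
\[
\underbrace{\binom{m}{2}}_{\text{spatial-spatial}}
+ \underbrace{\binom{n}{2}}_{\text{temporal-temporal}}
+ \underbrace{mn}_{\text{spatial-temporal}}
= \binom{m+n}{2}
= \frac{(m+n)(m+n-1)}{2}.
\]
```

With four spatial and four temporal scales ($m = n = 4$), this gives $6 + 6 + 16 = 28$ edges.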

Common spatial scales used in tourism GIScience include scenic spots, tourist routes, scenic areas, tourist destinations, provinces, and nations. Similarly, common temporal scales include year, month, week, day, hour, minute, and second. The number of spatial and temporal scales should be selected according to the size of the tourist destination (e.g., smaller scenic areas can skip the tourist route scale). In this paper, we use four scales in the spatial dimension, namely "scenic spot, tourist route, scenic area, tourist destination", and four scales in the temporal dimension, namely "year, month, day, time", as examples to introduce the dataset construction method for the spatial and temporal dimensions.

Consequently, in Section 3 “Theory and method”, the spatial scales are no longer limited to four levels (scenic spot scale, tourist route scale, scenic area scale, and tourist destination scale) and the temporal scales are no longer limited to four levels (year, month, day and hour). However, in Section 4 (the tourist destination popularity computing approach based on the granular computing model) and Section 5 (the experimental section, a case study from Jiuzhaigou), we selected four scales each for the spatial and temporal dimensions as an example, because these are the scales usually selected in this field. This choice lets us clearly describe the approach for constructing the granular computing model dataset and the results of applying the TDPMTGC model to Jiuzhaigou.

Please refer to lines 283–304 on pages 12–13 and lines 397–406 on page 17 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.
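
The idea that any spatial scale can be paired with any temporal scale can be sketched as follows. This is an illustration under simplifying assumptions, not the TDPMTGC implementation: each post is assumed to be already labelled at every scale, popularity is reduced to a raw count, and all names (`POSTS`, `granule_counts`, `spot_A`, `route_1`, `area_X`) are hypothetical.

```python
from collections import Counter

# Hypothetical posts, each already labelled at three spatial and two temporal scales.
POSTS = [
    {"spot": "spot_A", "route": "route_1", "area": "area_X", "year": 2016, "month": "2016-10"},
    {"spot": "spot_B", "route": "route_1", "area": "area_X", "year": 2016, "month": "2016-10"},
    {"spot": "spot_A", "route": "route_1", "area": "area_X", "year": 2016, "month": "2016-11"},
    {"spot": "spot_C", "route": "route_2", "area": "area_X", "year": 2017, "month": "2017-05"},
]


def granule_counts(posts, spatial_scale, temporal_scale):
    """Count posts per (spatial granule, temporal granule) for one chosen scale pair."""
    return Counter((p[spatial_scale], p[temporal_scale]) for p in posts)


# A fine layer (spot x month) and a coarse layer (route x year) built from the same posts.
fine = granule_counts(POSTS, "spot", "month")
coarse = granule_counts(POSTS, "route", "year")
```

Because both layers are derived from the same labelled granules, their totals agree, which is what makes counts at different scale pairs comparable in this simplified setting.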

Comment 2:

In Section 3.1, the spatial dimension is constructed from smaller scale to larger scale, but the temporal scale division is conducted from larger scale to smaller scale. The authors need to justify the reason.

Revision:

Thank you very much for your constructive comments. If the construction approach of the granular computing model dataset is not clearly explained, it could confuse readers, so your comments played an important role in improving the manuscript.

The spatial information (such as toponyms) in multi-scale unstructured UGC data is implicit in the text and needs to be identified layer by layer. Moreover, the data granules in a lower layer are subsets of those in the next higher layer and in a number of cross-scale layers. For example, tourist route granules at the tourist route scale include not only the granules carried up from the scenic spot scale (single-spot granules, single-route granules with multiple spots, and multiple-route granules with multiple spots) but also the single-route and multiple-route granules formed at the tourist route scale itself. After the granules in the lower layer have been constructed, they can be directly integrated into the granules in the upper layer, thus expanding to larger granules layer by layer. Because of this inclusion relationship between scales in the spatial dimension, the dataset is constructed from bottom to top using a scale from small to large and a granular scale that moves from fine to coarse. The temporal information in UGC data is explicit in each text; thus, data granules in lower layers inherit the labels of those in the upper layers (for example, a granule at a monthly scale must belong to a certain granule at a yearly scale). We adopt a tree structure to complete the construction of the data granules in the upper layer and then decompose them downward layer by layer. This approach clearly indicates the inheritance relationship among the data granules of each layer. Hence, in the temporal dimension, based on the spatial dataset, the dataset is constructed from top to bottom using a scale from large to small and a granular scale that moves from coarse to fine.
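The two construction directions can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the spot-to-route lookup table, the sample posts, and all function names are hypothetical, and toponym identification is reduced to plain substring matching. Spatial granules are built bottom-up (spots are identified first and then merged into routes), whereas temporal granules are built top-down (year granules are formed first and then decomposed into months and days, so each child inherits its parent's label).

```python
from collections import defaultdict
from datetime import date

# Hypothetical lookup: scenic spot -> tourist route (assumed for illustration).
SPOT_TO_ROUTE = {
    "Wucaichi": "Zechawagou",
    "Changhai": "Zechawagou",
    "Wuhuahai": "Rizegou",
    "Luweihai": "Shuzhenggou",
}

# Synthetic posts: (publication date, text).
POSTS = [
    (date(2017, 8, 8), "Wucaichi and Changhai were stunning"),
    (date(2017, 8, 9), "Wuhuahai in the morning"),
    (date(2016, 10, 2), "Luweihai then Wuhuahai"),
]


def build_spatial_bottom_up(posts):
    """Identify spot granules in the text, then merge them into route granules."""
    spot_granules = defaultdict(set)
    for idx, (_, text) in enumerate(posts):
        for spot in SPOT_TO_ROUTE:
            if spot in text:  # naive substring match stands in for toponym recognition
                spot_granules[spot].add(idx)
    route_granules = defaultdict(set)
    for spot, members in spot_granules.items():
        route_granules[SPOT_TO_ROUTE[spot]] |= members
    return dict(spot_granules), dict(route_granules)


def split(indices, key):
    """Decompose one granule (a list of post indices) into child granules."""
    groups = defaultdict(list)
    for i in indices:
        groups[key(i)].append(i)
    return dict(groups)


def build_temporal_top_down(posts):
    """Form year granules first, then decompose each into months and days."""
    years = split(range(len(posts)), lambda i: posts[i][0].year)
    return {
        year: {
            month: split(month_idx, lambda i: posts[i][0].day)
            for month, month_idx in split(year_idx, lambda i: posts[i][0].month).items()
        }
        for year, year_idx in years.items()
    }


if __name__ == "__main__":
    spots, routes = build_spatial_bottom_up(POSTS)
    print(routes)
    print(build_temporal_top_down(POSTS))
```

In this sketch a post that mentions spots on two routes ends up in both route granules, which reflects the inclusion relationship described above; the temporal tree never needs such merging because each post carries exactly one timestamp.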

Please refer to lines 380–396 on pages 16–17 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 3:

The authors mentioned that existing approaches were treating "tourist destination as a whole unit", so this research is to provide an advanced approach by dividing the tourist destination into smaller scales. However, this research proposes a multi-spatiotemporal approach. The authors need to provide reasons why temporal scale is also included in the introduction section.

Revision:

Thank you very much for your constructive comment.

Research in the field of tourism geography usually includes a time scale. Although the previous popularity analysis methods of tourist destination made multi-scale divisions on the temporal scale, they regarded a tourist destination as an integral unit on the spatial scale and often ignored its internal spatial characteristics, which affected the precision of the method. In-depth analysis of the spatiotemporal characteristics between scales helps improve model precision. However, establishing an accurate relationship between text and spatial units of different scales and integrating multi-spatial and multi-temporal scales into a systematic model are still obstacles in the study of tourism GIScience.

To accurately granulate the spatial and temporal information of tourism text, a tourism text data granule is used to represent a landscape object, which is a unified whole that possesses multiple attributes, such as spatial and temporal dimensions. The multi-spatiotemporal scales are characterized by the multi-hierarchical structure of GrC, and the transformations of granular layers and data granule size are realized by the scale selection in spatial and temporal dimensions. Therefore, all scales between the spatial and temporal dimension are related, thus making the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers comparable. This approach achieves a quantitative description and comparison of the popularity value of granules between adjacent scales and cross-scales. Therefore, the tourist destination popularity with multi-spatiotemporal scales can be calculated in a systematic framework.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 81–88 and 101–112 on pages 4–5 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 4:

In calculating the attraction values, this research uses count of a specific scenic spot/route/etc. from the user generated text. However, the authors did not justify why 'count' is sufficient to calculate the attraction value. The authors should discuss the possibilities of other attributes as well. In addition, the uncertainties of analyzing user generated text should be discussed to justify that 'count' or other chosen attribute is positively correlated to attraction of the scenic spot/route/etc. This is because mentioning a specific scenic spot/route/etc. may not necessarily indicate their attractiveness, but may be of other reasons, e.g. accident, negative experience, hours of operation, etc.

Revision:

Thank you very much for your constructive comment.

We are sorry that your understanding of this paper differs from the meaning we want to express due to our unclear explanation. We want to convey the concept of measuring multi-spatiotemporal scale ‘tourist destination popularity’ based on text granular computing, namely, tourists' attention to the landscape at different spatiotemporal scales. The number of texts published by tourists about a tourist destination reflects their attention to the landscape of that destination at different spatiotemporal scales. Therefore, we can use the total mentions of a specific scenic spot/route/etc. from the user-generated text to calculate ‘popularity’ values. When we wrote the paper, we mistakenly used the word ‘attraction’ to mean ‘popularity’, which may have caused the ambiguity concerning this topic. We have modified the whole paper and changed ‘tourism attraction’ to ‘tourist destination popularity (TDP)’.

The texts of posts published by tourists are ‘counted’ to reflect the ‘tourist destination popularity’ at various spatiotemporal scales. This ‘popularity’ includes both positive and negative impressions (e.g., accidents, negative experiences, insufficient hours of operation, etc.). For example, in Section 4.3, several months with abnormal popularity were found through the monthly scale popularity variation tendency model, and the driving factors of these months were then further analyzed through the daily scale popularity distribution model. The abnormal popularity in June 2013 occurred mainly from the 10th to 13th, which overlapped with the Dragon Boat Festival holiday, i.e., the second holiday in which the free expressway was implemented in October 2012, thus intensifying tourists' desire to travel and leading to a sudden increase in tourist destination popularity. The abnormal popularity in October mainly lasted from the 2nd to 6th, coinciding with the National Day holiday. The popularity reached its highest value on October 2, which corresponded to the large-scale tourist detention event on that day. The daily variation in the August 2017 anomaly, with its highest peak from August 8 to 11 in Fig 4(b2), coincided with the 7.0-magnitude earthquake in Jiuzhaigou County on August 8, indicating that the disaster event was the main factor leading to the increase in popularity in Jiuzhaigou during this period.

Therefore, the popularity calculation method proposed in this paper can be used to identify periods of unusual attention from tourists that are coupled with holidays, special policies, tourism events and sudden disasters, thus providing a quantitative explanation for these abnormal phenomena.
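The drill-down from the monthly scale to the daily scale can be illustrated with a short sketch. It is not the model from Section 4.3: the dates are synthetic, the function names are hypothetical, and the rule used to flag an abnormal month (a count above 1.5 times the mean monthly count) is an assumption introduced only for this example.

```python
from collections import Counter
from datetime import date

# Synthetic publication dates; the August cluster mimics a sudden burst of attention.
DATES = (
    [date(2017, 7, 1), date(2017, 7, 15)]
    + [date(2017, 8, 8)] * 5
    + [date(2017, 8, 9), date(2017, 8, 10), date(2017, 8, 11)]
    + [date(2017, 9, 3), date(2017, 9, 20)]
)


def monthly_popularity(dates):
    """Count posts per (year, month) granule."""
    return Counter((d.year, d.month) for d in dates)


def daily_popularity(dates, year, month):
    """Count posts per day inside one month granule."""
    return Counter(d.day for d in dates if (d.year, d.month) == (year, month))


def abnormal_months(monthly, factor=1.5):
    """Assumed rule: a month is abnormal if its count exceeds factor * mean count."""
    mean = sum(monthly.values()) / len(monthly)
    return sorted(key for key, count in monthly.items() if count > factor * mean)


if __name__ == "__main__":
    monthly = monthly_popularity(DATES)
    for year, month in abnormal_months(monthly):
        daily = daily_popularity(DATES, year, month)
        peak_day, peak_count = daily.most_common(1)[0]
        print(year, month, "peak day:", peak_day, "posts:", peak_count)
```

The point of the sketch is only the two-step reading: a coarse temporal scale locates the unusual period, and the next finer scale shows which days drive it.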

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to the full revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 5:

One of my biggest concerns of this research is that the authors did not integrate spatial and temporal scales into one systematic model. As demonstrated in approach section and case study section, the approaches are dealing with the spatial scale and then the temporal scale separately. To me, this research is not proposing a multi-spatiotemporal approach, but rather a multi-spatial-scale and multi-temporal-scale approach. I would be intrigued to see spatial and temporal scales truly integrated in one granular computing model, which will make a great contribution to GIScience.

Revision:

Thank you very much for your constructive comments. The concern you raise, namely, integrating spatial and temporal scales into one systematic model, is the core of TDPMTGC. Your comments played an important role in improving the model. Thank you for your constructive suggestions.

In response, we provide a brief introduction in the Abstract and Introduction. We also modified Section 3.2.2 to explain how the granular computing model integrates spatial and temporal scales. Finally, in the case study section (Section 5.2 ‘Tourist destination popularity mining at multi-spatiotemporal scales’), we added an experimental verification of the system model.

(1) Abstract and Introduction

To accurately granulate the spatial and temporal information of tourism text, tourism text data granules are used to represent landscape objects. These granules are unified objects that possess multiple attributes, such as spatial and temporal dimensions. The multi-spatiotemporal scales are characterized by the multi-hierarchical structure of granular computing, and transformations of granular layers and data granule size are achieved by scale selection in the spatial and temporal dimensions. Therefore, all scales between the spatial and temporal dimension are related, which allows for the comparability of the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers. This approach achieves a quantitative description and comparison of the popularity value of granules between adjacent scales and cross-scales. Therefore, the TDP with multi-spatiotemporal scales can be deduced and calculated in a systematic framework.

(2) Section 3.2.2.

First, by defining the structure of a "tourism text data granule", we show how the spatial and temporal dimensions are integrated into a single systematic model as attributes of the data granules. A tourism text data granule is a complete entity with multiple attributes, such as space and time, which must be described from spatial, temporal and other dimensions. Among them, both the spatial and temporal dimensions contain multiple scales; thus, the multi-scale structure of granules corresponds to these multi-spatiotemporal scales (see Fig 1(a)). Using this approach, the time and space dimensions are integrated into a single systematic model reflected as attributes of data granules. To describe the spatiotemporal characteristics of the data granules, it is necessary to clearly indicate their spatiotemporal scale, which can be divided into the following situations: ① To describe the characteristics of data granules at a particular spatiotemporal scale, it is necessary to fix the spatial and temporal scales of the granules (see Fig 1(b)); ② To describe the characteristics of data granules at a specific spatial (or temporal) scale, it is necessary to fix the spatial (or temporal) scale of the granules and mine the evolution rules of granules at that multi-temporal (or multi-spatial) scale (see Fig 1(c) and 1(d)); and ③ To describe the characteristics of data granules at multi-spatiotemporal scales, multiple scales of the spatial and temporal dimensions of the granules should be selected to perform comprehensive mining (see Fig 1(e)).
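A minimal sketch of a data granule that carries both dimensions as attributes is given below. The class and function names are hypothetical, the records are synthetic, and the sketch assumes for simplicity that each text has exactly one label per scale (the intersecting granules discussed elsewhere in this response are ignored here). Fixing one spatial scale and one temporal scale corresponds to situation ①, varying one of them while fixing the other corresponds to situation ②, and enumerating all pairs corresponds to situation ③.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import product

SPATIAL_SCALES = ("spot", "route", "area")
TEMPORAL_SCALES = ("year", "month", "day")


@dataclass
class TextRecord:
    """One tourism text with its label at every spatial and temporal scale."""
    spatial: dict
    temporal: dict


# Synthetic records (assumed for illustration).
RECORDS = [
    TextRecord(
        {"spot": "Wucaichi", "route": "Zechawagou", "area": "Jiuzhaigou"},
        {"year": "2017", "month": "2017-08", "day": "2017-08-08"},
    ),
    TextRecord(
        {"spot": "Changhai", "route": "Zechawagou", "area": "Jiuzhaigou"},
        {"year": "2017", "month": "2017-08", "day": "2017-08-09"},
    ),
    TextRecord(
        {"spot": "Wuhuahai", "route": "Rizegou", "area": "Jiuzhaigou"},
        {"year": "2016", "month": "2016-10", "day": "2016-10-02"},
    ),
]


def popularity(records, spatial_scale, temporal_scale):
    """Situation 1: fix both scales and count texts per (place, period) granule."""
    return Counter(
        (r.spatial[spatial_scale], r.temporal[temporal_scale]) for r in records
    )


def across_temporal_scales(records, spatial_scale):
    """Situation 2: fix the spatial scale and vary the temporal scale."""
    return {t: popularity(records, spatial_scale, t) for t in TEMPORAL_SCALES}


def all_scale_pairs(records):
    """Situation 3: every combination of spatial and temporal scale."""
    return {
        (s, t): popularity(records, s, t)
        for s, t in product(SPATIAL_SCALES, TEMPORAL_SCALES)
    }


if __name__ == "__main__":
    print(popularity(RECORDS, "route", "month"))
```

Because every record keeps its labels at all scales, the same set of texts can be regrouped at any scale pair, which is what makes counts at different layers comparable.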

Then, we describe the implementation of the multi-spatiotemporal scale granular structure. The multi-spatiotemporal scale granular structure of tourism text data is represented by the complete graph shown in Fig 1(a), in which layers of the multi-spatial granular structure correspond to the scales. The data granules in the upper scale are transformed into those in the lower scale using the granulation criteria . The data granules decrease as the scale decreases. Similarly, layers of the multi-temporal granular structure correspond to the scales, and granules in the upper scale are transformed into those in the lower scale using the granulation criteria . A complete graph represents the existence of an edge (i.e., a correlation) between any spatial-spatial, temporal-temporal, or spatial-temporal scales. There are edges among the spatial-spatial scales, edges among the temporal-temporal scales, and edges among the spatial-temporal scales; thus, the total number of edges is . The correlation between temporal scales is presupposed by the "spatial-temporal" correlation (i.e., the correlation between two temporal scales ‘ — ’ for a spatial scale is obtained by granulating in layers and , which yields the correlations ‘ — ’ and ‘ — ’). The granular structure of tourism text data can be used not only to mine features of small-scale landscapes (where represents a tourist destination) over a short period (such as when represents an annual scale) but also to mine the life cycle evolutionary laws at large scales (where represents a national or even a global scale) over long periods (such as when represents several centuries (if the data are available)). According to the actual needs, subgraphs can be extracted from Fig 1(a) to achieve landscape law mining at a single-space/single-time scale (see Fig 1(b)), single-space/multiple-time scales (see Fig 1(c)), multiple-space/single-time scales (see Fig 1(d)), and multiple-space/multiple-time scales (see Fig 1(e)).
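The edge counts of such a complete graph can be checked with a short sketch. The symbols m (number of spatial scales) and n (number of temporal scales) are our own placeholders for this illustration and are not necessarily the notation of the manuscript. In a complete graph on m + n scale nodes, there are C(m, 2) spatial-spatial edges, C(n, 2) temporal-temporal edges and m·n spatial-temporal edges, which together give C(m + n, 2).

```python
from itertools import combinations
from math import comb


def edge_counts(m, n):
    """Enumerate the edges of a complete graph on m spatial and n temporal scale nodes."""
    nodes = [("S", i) for i in range(m)] + [("T", j) for j in range(n)]
    counts = {"spatial-spatial": 0, "temporal-temporal": 0, "spatial-temporal": 0}
    for (kind_a, _), (kind_b, _) in combinations(nodes, 2):
        if kind_a == kind_b == "S":
            counts["spatial-spatial"] += 1
        elif kind_a == kind_b == "T":
            counts["temporal-temporal"] += 1
        else:
            counts["spatial-temporal"] += 1
    return counts


if __name__ == "__main__":
    # Four spatial and four temporal scales, as in the example used in this paper.
    counts = edge_counts(4, 4)
    print(counts, "total:", sum(counts.values()))
    assert sum(counts.values()) == comb(4 + 4, 2)
```

With four scales in each dimension, the enumeration gives 6 spatial-spatial, 6 temporal-temporal and 16 spatial-temporal edges, 28 in total.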

In conclusion, a tourism text data granule is a unified whole possessing multiple attributes, such as a spatial and a temporal dimension. The transformations of granular layers and data granule size are achieved by scale selection in both the spatial and temporal dimensions. Therefore, all the scales between the spatial and temporal dimensions are related, which allows for the comparability of the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers. This approach allows for comparisons of the popularity value of data granules both among adjacent scales and across scales, yielding unique information that cannot be obtained via other, possibly simpler, approaches (e.g., simply counting the number of visitors or the number of social media posts). We can analyze the geographic spatiotemporal relations among the multiple granular layers using the granular structure. Thus, the granular structure is a useful tool for finely describing the multi-spatiotemporal patterns of tourist destination popularity.

(3) In the case study part.

Fig 3 is taken as an example to demonstrate the systematic model from three aspects, namely, ‘The spatiotemporal model associated with scenic areas’, ‘The spatiotemporal model associated with tourist routes’, and ‘The spatiotemporal model associated with scenic spots’. Finally, we conclude that TDPMTGC makes the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers comparable and then achieves the comparison of popularity values of data granules between adjacent scales and across scales. Detailed and quantitative descriptions of tourist destination popularity at multi-spatiotemporal scales are helpful for comprehensively and deeply exploring the spatiotemporal characteristics of tourism from the viewpoint of tourists' cognition.

Please refer to lines 29–38 on page 2, lines 101–112 on page 5, lines 253–258 and 267–315 on pages 11–12, and lines 595–798 on pages 26–36 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Reviewer #2:

Comment 1:

Weak point:

- The added value of TAMTGC is unclear. In other words, what unique information can we gain by applying TAMTGC to texts, which cannot be obtained via other possibly simpler approaches? For example, in lines 51-52, the authors wrote: "Tourism attraction can be expressed using the number of visitors [2-4], the index related to online search and evaluation, and the User Generated Content (UGC) published by tourists [5-7]." So what unique and additional information can we gain using TAMTGC compared with e.g., using simply the number of visitors or the number of social media posts? To address this, the authors may need to do two things. First, the authors may need to enrich the introduction section to clarify the unique information obtained by TAMTGC. Second, the authors may need to add some comparisons in their case study of Jiuzhaigou to show the additional information that can be obtained by TAMTGC.

Revision:

Thank you very much for your constructive comment.

The added value of TDPMTGC is described in four parts of the paper.

(1) Abstract

To accurately granulate the spatial and temporal information of tourism text, tourism text data granules are used to represent landscape objects. These granules are unified objects that possess multiple attributes, such as spatial and temporal dimensions. The multi-spatiotemporal scales are characterized by the multi-hierarchical structure of granular computing, and transformations of granular layers and data granule size are achieved by scale selection in the spatial and temporal dimensions. Therefore, all scales between the spatial and temporal dimension are related, making the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers comparable. This approach achieves a quantitative description and comparison of the popularity value of granules between adjacent scales and cross-scales. Therefore, the TDP with multi-spatiotemporal scales can be deduced and calculated in a systematic framework.

(2) Introduction and Section 3.2.2

"... a tourism text data granule is used to represent a landscape object, which is a unified whole that possesses multiple attributes, such as spatial and temporal dimensions. The multi-spatiotemporal scales are characterized by the multi-hierarchical structure of GrC, and the transformations of granular layers and data granule size are realized by the scale selection in spatial and temporal dimensions. Therefore, all scales between the spatial and temporal dimension are related, which allows for the comparability of the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers. This approach achieves a quantitative description and comparison of the popularity value of granules between adjacent scales and cross-scales. Therefore, the tourist destination popularity with multi-spatiotemporal scales can be calculated in a systematic framework. Thus, we can gain unique information by applying TDPMTGC to texts that cannot be obtained via other, possibly simpler, approaches (e.g., simply counting the number of visitors or the number of social media posts)".

(3) A case study from Jiuzhaigou:

In Section 5.2 of this paper, we describe in detail how the TDPMTGC method can achieve quantitative comparisons of the popularity values of data granules between adjacent scales and across scales, meaning that the unique information obtained by TDPMTGC can be compared with that of other methods.

Finally, we conclude that TDPMTGC makes the data granules of all spatial-spatial, temporal-temporal and spatial-temporal layers comparable and then achieves the comparison of popularity values of data granules between adjacent scales and across scales. Detailed and quantitative descriptions of tourist destination popularity at multi-spatiotemporal scales are helpful for comprehensively and deeply exploring the spatiotemporal characteristics of tourism from the viewpoint of tourists' cognition.

(4) Discussion

In the Discussion, we compare TDPMTGC with three approaches, including the one mentioned by the reviewer, and illustrate the advantages of our method and the unique information it offers.

First, this paper compares three relevant research approaches to illustrate the inheritance and further innovation of TDPMTGC based on existing approaches and how it will lead to further research.

TDPMTGC has good adaptability to spatiotemporal scales and types of tourist destinations. In terms of scale design, the granulation criteria of each layer are independent. The data in the upper scale are mapped to the data in lower scales through granulation criteria between each layer. Making changes in the granular layers and scale requires changing only the granulation criteria between the affected adjacent granular layers, which will not affect other granular layers. Therefore, the number of spatiotemporal scales can be adjusted dynamically based on the scale and development characteristics of tourist destinations when using TDPMTGC. For example, some tourist destinations, such as ancient cities, have no tourist routes; thus the spatial scales could be simplified and the tourist route layer could be deleted. TDPMTGC is applicable to tourist destinations with different types and themes, for example, nature and humanity, which can facilitate comparative studies involving different types of tourist destinations.

TDPMTGC can be adapted to dynamic changes in the data. The granular structure of tourism text data supports the expansion of dynamic incremental data in a specific granular layer without affecting other layers. TDPMTGC can dynamically calculate tourist destination popularity corresponding to the varying granular layers and achieve real-time monitoring of tourist destination popularity at multi-spatiotemporal scales.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 29–38 on page 2, lines 101–112 on page 5, lines 305–315 on page 13, lines 595–798 on pages 26–36 and lines 850–864 on page 38 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 2:

- Lines 82-85: The authors may consider also discussing the following related paper on analyzing UGC for discovering interesting zones.

Hu, Y., Gao, S., Janowicz, K., Yu, B., Li, W., & Prasad, S. (2015): Extracting and understanding urban areas of interest using geotagged photos, Computers, Environment and Urban Systems, 54, 240-254.

Revision:

Thank you very much for your constructive comment.

In the Discussion section, we compare TDPMTGC with three related research approaches, including the one mentioned above.

Previous tourist destination popularity research approaches have important implications for this paper. We take the approaches of Hu [41], Wang [5], Tang [40] as examples and compare them with the TDPMTGC proposed in this paper to both acknowledge the inheritance TDPMTGC owes to the existing approaches as well as its further innovations and to reflect its potential advantages and value in future applications, leading to further research questions.

(1) Dataset. Before the advent of the big data era, questionnaires represented the main method of obtaining user data (e.g., Tang et al [40]). However, the rapid development of the Internet has caused the data scale to explode. Increasingly, scholars focus on mining social media data, such as Flickr photos and microblog data (e.g., Hu et al [41] and Wang et al [5]). TDPMTGC uses the full content of tourism UGC texts, which contain rich spatiotemporal and semantic information that is conducive to in-depth explorations of the rules governing tourists' spatiotemporal behaviors and analysis of the driving mechanisms of tourism spatial patterns and processes. This approach better reflects users' real emotional trends than does data collected based on specific research objectives, such as questionnaire surveys and interviews, and it reduces the differences caused by sparse or inconsistent samples. For example, analyzing the variation tendency of popularity of Jiuzhaigou at the daily scale, we find that the unusual period of attention by tourists is associated with holidays, special policies, tourism events and sudden disasters. The feature extraction of tourism UGC text from an abnormal time period can be used to analyze users' emotional trends. One advantage of TDPMTGC is that the data types it can use are unrestricted. Although we chose text for this study, other types of data could also be employed, and we plan to conduct further research using Flickr photos.

(2) Methodology. Hu et al [41] designed a three-layer framework to extract areas of interest (AOIs) from geotagged photos to understand the spatiotemporal dynamics of these areas. Tang et al [40] constructed a model of tourists' sense of place and studied their perceptions and evaluations of tourist destinations from four dimensions: natural scenery, social cultural setting, tourism function, and affectional attachment. Wang et al [5] used the kernel density estimation (KDE) algorithm to analyze tourists' attention to the landscape at multi-spatiotemporal scales. Most of the existing methods have regarded a tourist destination as an integral spatial unit for studying evolutionary rules at multi-temporal scales. While others consider multi-spatiotemporal scales, there is no correlation concerning the values between scales, which affects the accuracy of these approaches. Inspired by the existing methods, TDPMTGC fully considers the spatiotemporal scale characteristics of big data. Tourism text data granules are used to represent landscape objects in tourism geography, the multi-spatiotemporal scales in tourism GIScience are depicted by the multi-hierarchical structure of GrC, and the spatial and temporal dimensions are integrated into a systematic framework as attributes of the data granules. In this way, quantitative calculations of multi-spatiotemporal scales and popularity deduction between adjacent scales and across scales can be achieved. The potential advantages and values of this approach will be reflected by the following aspects in future applications.

① TDPMTGC has good semantic scalability. UGC data are granularized and reorganized based on spatiotemporal scales to form text data granules with clear spatiotemporal semantics. Moreover, the granulation criteria can be extended to geography or to other thematic semantics, such as tourism emotion, sightseeing, consumption behaviors and service perceptions. Thus, this approach can not only quantitatively calculate tourist spatial popularity but can also be combined with other methods for studying tourist spatiotemporal behaviors, landscape preferences, and spatial images. TDPMTGC has a wide range of applications and can be used to support different research goals in tourism, geography or other fields of humanities and social sciences.

② TDPMTGC has good adaptability to spatiotemporal scales and types of tourist destinations. In terms of scale design, the granulation criteria of each layer are independent. The data in the upper scale are mapped to the data in lower scales through granulation criteria between each layer. Making changes in the granular layers and scale requires changing only the granulation criteria between the affected adjacent granular layers, which will not affect other granular layers. Therefore, the number of spatiotemporal scales can be adjusted dynamically based on the scale and development characteristics of tourist destinations when using TDPMTGC. For example, some tourist destinations, such as ancient cities, have no tourist routes; thus, the spatial scales could be simplified, and the tourist route layer could be deleted. TDPMTGC is applicable to tourist destinations with different types and themes, for example, nature and humanity, which can facilitate comparative studies involving different types of tourist destinations.

③ TDPMTGC can be adapted to dynamic changes in the data. The granular structure of tourism text data supports the expansion of dynamic incremental data in a specific granular layer without affecting other layers. TDPMTGC can dynamically calculate tourist destination popularity corresponding to the varying granular layers and achieve real-time monitoring of tourist destination popularity at multi-spatiotemporal scales.

(3) Experimental results. By comparing the AOI growth model, Hu et al [41] found that AOIs in developed cities have large initial areas but slow development speeds, while AOIs in rapidly developing cities have low initial values but significant growth rates. Tang et al [40] found that the natural landscape of Jiuzhaigou has received high perception evaluation scores and presents good general recognition by tourists, while the perception evaluation scores of its social and cultural environment are relatively low. Wang et al [5] discovered popular routes and scenic spots in Jiuzhaigou by mining the spatial pattern and evolutionary processes of tourists' attention at multi-spatiotemporal scales. TDPMTGC not only obtained conclusions consistent with these previous results but also revealed detailed features of tourist destination popularity that were not described in previous studies because it allows a quantitative analysis of the driving forces of tourism phenomena. These results suggest that TDPMTGC has better precision and quantitative and cross-scale calculation and deduction abilities compared with previous approaches.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 804–876 on pages 36–39 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 3:

- Line 468: "In total, we collected >100,000" It would be better to use the exact number of posts here.

Revision:

Thank you very much for your constructive comment.

The precision of the paper was improved by using more accurate numbers. We revised ‘>100,000’ to the exact number of posts, i.e., 105,226.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to line 590 on page 26 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 4:

- Is the dataset used in the case study all Sina microblog posts published during this period in the study area or only a sample? Please clarify.

Revision:

Thank you very much for your constructive comment.

We apologize for being unclear in our description of the dataset in the original paper and possibly confusing readers. We have clarified the dataset in the revised manuscript as follows: In total, we collected 105,226 microblog posts from 2013 to 2017, which constitutes all the Sina microblog posts published during this period regarding Jiuzhaigou. By filtering noise data (the number of noise data is 68,486), we obtained 36,740 valid tourism text entries (see Table 1) that constitute the dataset.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 589–593 on page 26 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 5:

- Table 5 has too much information and is overwhelming. Maybe the authors can highlight some values with bold font.

Revision:

Thank you very much for your constructive comment.

We agree that tables with too much information are overwhelming without bold font and will confuse readers. In this revision, we used bold font and underlined text to highlight the three levels of popularity spots. Three scenic spots in the first level have the highest popularity: Wucaichi and Changhai in Zechawagou and Wuhuahai in Rizegou (in bold underlined font). The scenic spots in the second level with high popularity are concentrated in Rizegou, including the 7 scenic spots of Zhenzhutanpubu, Jianzhuhai, Nuorilangpubu, Xiongmaohai, Yuanshisenlin, Jinghai and Zhenzhutan (in bold font). The scenic spots in Shuzhenggou are ranked only at the third level and include Luweihai, Huohuahai, Shuzhengzhai, Laohuhai and Xiniuhai (underlined font).

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to line 611 on pages 27–28 and lines 701–707 on page 32 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 6:

- Figure 4: Would the temporal variation of the attraction be similar or different from the numbers of microblog posts in the same time period? The authors may need to provide a comparison and discussion here.

Revision:

Thank you very much for your constructive comment.

To explain this problem clearly, we added Section 5.2.4 "The relationship between popularity variation tendency and the numbers of microblog posts" for comparison and discussion.

The temporal variation of the TDP is calculated based on the numbers of microblog posts during the same time period. The two variations are similar but not identical, and there are three main differences.

(1) Source data and reorganized data. The temporal variations in TDP calculated by TDPMTGC are based on the text dataset after data reorganization rather than on the source microblog posts for the research area during the same period. Taking 2017 as an example, 20,764 pieces of source data focused on Jiuzhaigou, but this number was reduced to 7,277 after data reorganization. A comparison of the daily variation patterns within months (see Fig 4(d1)) showed that their overall trend was consistent and both were affected by the earthquake in Jiuzhaigou on August 8. However, the source data contain texts that are unrelated to the research area; thus, the variations are not exactly the same.

(2) Intersections between data granules. Intersections occur between data granules at some spatial scales, and the intersecting parts of the text belong to multiple granules. When calculating the comprehensive popularity of data granules, the intersecting text must be counted in each granule to which it belongs. This expands the text count relative to the source data, so the variations differ slightly from the trend in the number of microblog posts. For example, the route granules at the tourist route scale include a single spot, one route with multiple spots, multiple routes with multiple spots, a single route and multiple routes; of these, the "multiple routes with multiple spots" and "multiple routes" granules belong to several route granules simultaneously. Therefore, the absolute count at the route scale is slightly different from the overall number of microblog posts (see Fig 4(d2)).

(3) The popularity value of the same data granules can be different at different scales. For example, the popularity of Wucaichi at the scenic spot scale is 18.91% as calculated based on scenic spots, while its comprehensive popularity in the scenic area is 5.85% (see Table 5). Moreover, due to the different inclusion relationships of data granules at different scales, there is not necessarily a proportional relationship between the popularity values (for example, multiple routes with multiple spots granules belong to multiple tourist route granules at the tourist route scale but only to one granule at the scenic area scale and thus are calculated differently at different scales).
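Points (2) and (3) can be illustrated with a small sketch. It uses hypothetical toy posts rather than the paper's dataset, and a plain percentage share rather than the inclusion-degree formulas of TDPMTGC, so it only shows why counts expand when a post belongs to several route granules and why the same granule receives a different value when the denominator changes with scale. All names below are assumptions made for the illustration.

```python
from collections import Counter

# Hypothetical toy posts (not from the paper's dataset). Each post lists the
# route granules it mentions; an empty set stands for non-toponym text.
POSTS = [
    {"Rizegou"},
    {"Rizegou"},
    {"Zechawagou"},
    {"Rizegou", "Zechawagou"},  # one post intersecting two route granules
    {"Shuzhenggou"},
    set(),
    set(),
]


def route_granule_counts(posts):
    """Count each post once for every route granule it belongs to."""
    counts = Counter()
    for routes in posts:
        counts.update(routes)
    return counts


def percent_share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


counts = route_granule_counts(POSTS)
toponym_posts = sum(1 for routes in POSTS if routes)  # posts naming a route
expanded_total = sum(counts.values())                 # route-scale memberships
post_total = len(POSTS)                               # all posts in the area

# Same granule, two denominators: route-scale memberships vs. all posts.
rizegou_route_share = percent_share(counts["Rizegou"], expanded_total)
rizegou_area_share = percent_share(counts["Rizegou"], post_total)

print(f"posts naming a route: {toponym_posts}, memberships: {expanded_total}")
print(f"Rizegou share at route scale: {rizegou_route_share:.2f}%")
print(f"Rizegou share over all posts: {rizegou_area_share:.2f}%")
```

In this toy case the five posts that name a route yield six route-scale memberships, and the two shares for the same granule are not proportional to each other.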

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 763–792 on pages 34–35 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 7:

- Lines 582-583: "TAMTGC can use the full volume of the tourism UGC texts, which can better reflect users' real emotional trends"? Could the authors provide some explanation on "emotional trends" and how TAMTGC can help discover these emotional trends?

Revision:

Thank you very much for your constructive comment.

We only briefly mentioned in the discussion section of the article that "TDPMTGC can use the full volume of the tourism UGC texts, which can better reflect users' real emotional trends" and did not provide a further explanation, which was an oversight. Therefore, we added a more complete explanation of this potential advantage. ‘For example, analyzing the variation tendency of popularity of Jiuzhaigou at the daily scale, we find that the unusual period of attention by tourists is associated with holidays, special policies, tourism events and sudden disasters. The feature extraction of tourism UGC text from an abnormal time period can be used to analyze users' emotional trends.’

This approach will be the next step in our work and is currently already under study. We expect that the final results will also echo this paper well.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 818–821 on pages 36–37 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Reviewer #3:

Comment 1:

It is better to add a new section “Literature Review” or “Existing Work” to summarize previous research on related methods of semantic knowledge discovery in GIScience, related spatiotemporal data mining methods, tourism attraction analysis, granular computing models, etc., and then to reorganize the Introduction section.

Revision:

Thank you very much for your constructive comment.

In this revision, we added a "Literature Review" section that covers semantic knowledge discovery in GIScience, spatiotemporal data mining methods, tourism attraction analysis, and granular computing models. We also modified the introduction accordingly.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 128–191 on pages 6–8 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 2:

The paper claims 5 aspects of contributions in introduction section. In my opinion, some contributions are not significant enough. For example, the 4th item “TAMTGC is extensible” cannot be thought of as a contribution. And Item 1 and 2 can be combined to illustrate the contribution of TAMTGC. Item 5 should be modified to claim the TAMTGC model was successfully applied in Jiuzhaigou area to obtain some new insightful research conclusion of tourist attractions in this area.

Revision:

Thank you very much for your constructive comment.

We have modified this part as follows.

The main contributions of this paper to tourism GIScience are as follows. (1) We introduce the granular computing (GrC) model into tourism geography through the TDPMTGC algorithm, which constructs a quantitative model of tourist destination popularity (TDP) at multi-spatiotemporal scales based on GrC using the inclusion degree. The proposed TDPMTGC can describe the TDP at a single spatial or temporal scale as well as the patterns and processes of TDP at multi-spatiotemporal scales. (2) A dataset construction approach for the text GrC model is proposed to provide a feasible scheme for reorganizing large-scale unstructured text and constructing public spatiotemporal UGC tourism datasets. (3) The TDPMTGC model was successfully applied in the Jiuzhaigou area, resulting in some new insightful conclusions regarding TDP in this area. TDPMTGC provides a new data mining approach for exploring tourist behaviors and analyzing the driving mechanisms of tourism patterns and processes both spatially and temporally.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 113–123 on page 5 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 3:

In Section 3, some formulas are very long and not very readable. In particular, formulas are inserted into some sentences, which confuses readers. For example, “the total number of A, B and C of 49 scenic spots”, and “the attraction of A of Zhenzhutanpubu is XXXXX”. A suggestion is that the authors replace some formulas with simple symbols (use letters A, B, C, or use simple words), and use these simple symbols in sentences when complex formulas have to appear.

Revision:

Thank you very much for your constructive comment.

We agree that the formulas inserted in some sentences might confuse readers; therefore, we have replaced these formulas with simple symbols in the running text. The number in the upper right corner of each symbol indicates the scale. Wherever these letters appear in the text, the popularity value is calculated with the corresponding formula. The modifications are as follows (the following material consists of selected excerpts from Sections 4.1.1 and 4.2.1):

(1) Scenic spot scale: the popularity values of a single spot, one route with multiple spots, and multiple routes with multiple spots are denoted A4, B4 and C4, respectively, in the following passage. [The original symbols and the calculation formulas for A4, B4 and C4 are not preserved in this copy.]

(2) Tourist route scale: we now use A3, B3, C3, D3 and E3 in place of the five original symbols. [The original symbols and the calculation formulas for A3, B3, C3, D3 and E3 are not preserved in this copy.]

(3) Scenic area scale: we now use A2, B2, C2, D2 and E2 in place of the five original symbols. [The original symbols and the calculation formulas for A2, B2, C2, D2 and E2 are not preserved in this copy.]

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 415–441 on pages 18–19 and lines 488–565 on pages 21–25 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 4:

In Section 4.2, the result of spatial scale is described using table including different place names as rows. It would be better to use maps to obtain better result visualization effects. Especially, most readers are not familiar with where Jiuzhaigou is, and where the locations of different tourist spots are. So a map of Jiuzhaigou describing locations of different travel spots could be helpful.

Revision:

Thank you very much for your constructive comment.

We apologize for overlooking the fact that most readers may not be familiar with the location of Jiuzhaigou or the locations of the different tourist spots mentioned. Therefore, we replaced Fig 3 with Fig 5 in Section 5.2, which shows the spatial distribution of Jiuzhaigou. In Fig 5, we divided the scenic spots into four levels according to their popularity value and marked the popularity level of each scenic spot at its position on the route to which it belongs. Readers can easily see that scenic spots at different levels of spatiotemporal scale show different popularity distribution patterns. This map includes not only the distribution of tourist destination popularity among the scenic spots within each route but also the location of each scenic spot, allowing readers to obtain the information more intuitively.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 620–622 on page 28 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 5:

From Table 2-5, it could be found that most calculation results are VERY small between 0.0000 and 0.0100. Can the authors consider some data normalization method, to normalize the intermediate data and final results to a value between 0.0 and 1.0, or a tourist attraction score between 0.0 and 100.0?

Revision:

Thank you very much for your constructive comment.

Two factors resulted in popularity values between 0.0000 and 0.0100: the difference in landscape popularity (i.e., the proportion of toponym text at the scenic area scale (93.33%) is much higher than that of nontoponym text (6.67%)) and the large number of landscape features (i.e., there are 49 scenic spots, which leads to smaller popularity values). In this revision, we normalized the numbers in Tables 2–5 so that they fall within the range 0–100%.
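The response does not state which normalization was applied, so the following is only one plausible reading: each raw value is expressed as a percentage share of the total for its table. The function name and the raw values are hypothetical and are not taken from Tables 2–5.

```python
def to_percent_shares(raw_values):
    """Rescale raw popularity values so that they sum to 100 (percent)."""
    total = sum(raw_values.values())
    if total <= 0:
        raise ValueError("raw values must sum to a positive number")
    return {name: 100.0 * value / total for name, value in raw_values.items()}


# Hypothetical raw values in the 0.0000 to 0.0100 range described above.
raw = {"spot_a": 0.0060, "spot_b": 0.0030, "spot_c": 0.0010}
shares = to_percent_shares(raw)

# With 49 scenic spots, a perfectly even split would give each spot
# 100 / 49 percent, which is why individual values stay small.
even_share_49_spots = 100.0 / 49

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
print(f"even share among 49 spots: {even_share_49_spots:.2f}%")
```

A share-of-total rescaling keeps the ranking of the values unchanged while moving them onto a 0–100% scale that is easier to read in a table.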

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to lines 608–611 on pages 27–28 of the revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Comment 6:

Some word and grammar errors can be found. There is a logic error in the FIRST sentence of this paper. It should be “tourism GIScience mainly studies a series of basic problems in XXXXX …” During my review of this paper, more than 10 grammar errors were found, including tense inconsistency and preposition errors. In addition, “multi-spatiotemporal” should be used instead of “multispatiotemporal”. When using “multi” with other nouns, there should always be a “-” between them.

Revision:

Thank you very much for your constructive comment.

We apologize for having made so many mistakes in writing the original paper. We have checked the entire text carefully and corrected the errors in the text (the modified part is marked with red font). If there are any more problems, please do not hesitate to let us know and we will correct them in time. We appreciate your help and support.

Your comments played an important role in improving the manuscript. Thank you for your constructive suggestions.

Please refer to the full revised ‘Manuscript’ and ‘Revised Manuscript with Track Changes’.

Attachment

Submitted filename: Response to Reviewers.doc

Decision Letter 1

Song Gao

9 Jan 2020

Measuring multi-spatiotemporal scale tourist destination popularity based on text granular computing

PONE-D-19-21928R1

Dear Dr. Renjie,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Professor Song Gao, Ph.D.

Academic Editor

PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: (No Response)

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have sufficiently addressed my comments. The current spatiotemporal model is more scientifically sound than the original one.

Reviewer #2: (No Response)

Reviewer #3: I am glad to see the authors revised the paper according to my reviews carefully and seriously. Language and grammar problems have been edited. A map of research area has been added, and a single section of literature review was added with sufficient work. The formulas were more readable, and the authors also used some data normalization (percentage) to their analysis results. The biggest highlight of this paper is in data analysis section, where the authors provided detail information to analyze their results so the contribution of this paper could be highlighted. Spatial and temporal scale have been also combined in data analytics with sufficient details. For the general quality of this paper, I recommend that this paper could be accepted.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes: Kejin Cui

Acceptance letter

Song Gao

18 Feb 2020

PONE-D-19-21928R1

Measuring multi-spatiotemporal scale tourist destination popularity based on text granular computing

Dear Dr. Renjie:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Song Gao

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 File. Supporting document for the use of dataset.

    (PDF)

    Data Availability Statement

    Data were purchased from Beijing Weimengkechuang network technology co. LTD (北京微梦科创网络技术有限公司), which owns the commercial Sina microblog. The authors confirm that interested researchers can replicate their study findings in their entirety by directly obtaining the data from the third-party and following the protocol in our Methods section. Other researchers would be able to access the data set in the same manner as the authors, and the authors did not have any special access privileges that others would not have. The authors provide the following information about Sina microblog: located at Sina headquarters building, building 8, west district, no.10 Xibeiwang East Road, Haidian district, Beijing (北京市海淀区西北旺东路10号院西区8号楼新浪总部大厦); URL: https://open.weibo.com/wiki/C/2/place/nearby_timeline/biz.


    Articles from PLoS ONE are provided here courtesy of PLOS
