PLOS One. 2021 Apr 15;16(4):e0250106. doi: 10.1371/journal.pone.0250106

Effects of raster terrain representation on GIS shortest path analysis

F Antonio Medrano 1,*
Editor: Timothy C Matisziw 2
PMCID: PMC8049281  PMID: 33857239

Abstract

Spatial analysis extracts meaning and insights from spatially referenced data, where the results are highly dependent on the quality of the data used and the manipulations on the data when preparing it for analysis. Users should understand the impacts that data representations may have on their results in order to prevent distortions in their outcomes. We study the consequences of two common data preparations when locating a linear feature performing shortest path analysis on raster terrain data: 1) the connectivity of the network generated by connecting raster cells to their neighbors, and 2) the range of the attribute scale for assigning costs. Such analysis is commonly used to locate transmission lines, where the results could have major implications on project cost and its environmental impact. Experiments in solving biobjective shortest paths show that results are highly dependent on the parameters of the data representations, with exceedingly variable results based on the choices made in reclassifying attributes and generating networks from the raster. Based on these outcomes, we outline recommendations for ensuring geographic information system (GIS) data representations maintain analysis results that are accurate and unbiased.

Introduction

Spatial analysis is used to bring meaning and insights out of spatially referenced data, and the set of methods that are identified as spatial analysis tend to be some of the most heavily used in geographic information system (GIS) software [1]. As with any sort of analysis, the results from spatial analysis are highly dependent on the quality of the data provided, as well as the understanding that the GIS user has with respect to the methods used. A GIS user must almost always prepare and manipulate spatial data in order to make it suitable for use in analysis, and thus it is imperative that the user understand the impacts that these manipulations may have on the final results. Otherwise, the outcome of a spatial analysis may inadvertently be distorted. While misinformation through cartographic manipulations has been well documented [2, 3], a GIS user who has a desired outcome from the analysis may even use data manipulations to covertly drive the solutions toward that goal. Thus, it is important to be aware of the effects of spatial data representation, and to establish guidelines that help to ensure that GIS analyses accurately represent real-world conditions and provide impartial solutions.

While GIS analysis techniques are numerous and broad, and an entire book could be written covering all impacts of data representation, the main objective of this article is to focus on the impacts of two common data transformations when representing terrain as a raster network for locating a linear feature using shortest path analysis: 1) defining the network generated by connecting raster cells to their neighbors, and 2) the range of the attribute scale that represents the costs to locate the feature at each raster cell. Raster-based shortest path analysis is the predominant method for locating linear features over terrain, such as new transmission line corridors [4–13], pipelines [14–16], and roadways [17, 18], as well as for analyzing the connectivity of a landscape for habitat analysis [19–23] and urban systems [24]. These applications typically require generating a set of non-inferior options that balance numerous competing interests such as economic cost, environmental impact, maintenance accessibility, visual pollution, etc., and from that set of options a decision-making entity can select the final route alignment. Multi-objective shortest path (MOSP) analysis is commonly used for generating such alternatives, since it finds the set of optimal trade-off solutions between multiple competing objectives, and thus can find compromise solutions to best satisfy various parties with different values and priorities [9]. MOSP analysis is valuable for highlighting the representation effects of network connectivity and attribute scale since it provides a rich set of path solutions from which to see the effects of varying parameters of the data representations. When using just two objectives, MOSP analysis is known as biobjective shortest path (BOSP) analysis.

This study examines the effects of raster connectivity and attribute scale via BOSP analysis, comparing the number of Pareto-optimal solutions, the layout of the paths in decision space, and the performance of the solutions in objective space where applicable. We look at the guidelines found in the literature on locating transmission line corridors, and see how their recommendations affect the quality of the analytic solutions. In the discussion and conclusion, we provide guidelines to ensure that such spatial analyses are performed with the appropriate modeling accuracy and objectivity.

Background

The world is infinitely complex and continuous, and modeling it exactly on a digital computer is not possible. Thus, representing space on a computer requires discretization of both space and attributes. In the context of corridor location, information on terrain is often generated from remote sensing digital imagery that consists of a regular grid of pixels, and thus raster is the natural representation of such spatial data. Spatial attribute information is categorized into one of four basic measurement levels: nominal, ordinal, interval, and ratio [25]. Some spatial analysis techniques can be performed on nominal and interval features, such as go/no-go suitability analysis, spatial overlay, or location set covering. But ratio-level data is required for shortest path analysis, location-allocation problems such as the p-Median problem [26, 27], or any other analysis that requires multiplying the attribute value by a distance. When a GIS analyst uses a dataset that contains nominal or ordinal level data such as landcover type, they will often have to perform a reclassification to convert it into ratio-scaled data. It is this conversion that can lead to erroneous or misleading results if the reclassification is performed carelessly, and in this article we examine the consequences of such inaccurate reclassification. The effects of representation on the results of spatial analysis have been a known problem with GIS for quite some time. Miller [28] observed that “Spatial analysis was mostly developed in an era when data was scarce and computational power was expensive. Consequently, traditional spatial analysis greatly simplifies its representations of geography”. As technology progresses, he suggests an ongoing “re-examination of geographic representation in spatial analysis”. Tong and Murray [29] point out that “it is well recognized that findings can be highly dependent on how space is abstracted and represented. This can be due to the way we partition or conceptualize space” and that “much research is needed to reduce or alleviate errors and uncertainties in abstracting geographical space”. This article addresses the impacts of these representation issues in the context of shortest path analysis, and provides recommendations for best practices to avoid such problems. Geospatial representation and its effects on analysis continue to be an active area of research. For example, Gaboardi and Folch [27] evaluated spatial network representation for allocating and connecting points on a network, and found this could have substantive effects on the results of a p-Median and p-Center location analysis.

Any shortest path computation requires a network upon which to find the least-cost route, as raster data sets do not fundamentally have a built-in network structure. Methods have been developed to convert a raster into a network by assuming that the center of each raster cell is a node and defining arcs as links that connect each cell to its neighboring cells. Each arc then has a cost function which is the distance-weighted sum of the attributes of cells the arc traverses, called the cost-distance function. If the arc has width then the area of the path that intersects a cell is typically used as a weight instead of the distance of the arc intersecting that cell [4, 30, 31]. Assuming zero width as is prevalent with most built-in GIS functionality, the objective cost for any path is thus the sum of these cost-distance weighted arcs that contiguously connect the origin and destination locations. A shortest path problem finds the path that minimizes this cost. In a multi-objective shortest path algorithm, each arc that makes up a path has multiple costs corresponding to each objective, and any path will have a set of objective scores where the scores represent the performance of the path with respect to each objective.
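As a minimal sketch of the cost-distance function for zero-width arcs, the Python snippet below assumes unit cell size and only rook or queen moves, so that half of each arc lies in each of the two cells it connects. The function names `arc_cost` and `path_cost` and the toy grid are hypothetical illustrations, not taken from the study's code.

```python
import math

def arc_cost(cost, a, b):
    """Cost-distance of a zero-width arc between adjacent cell centers a and b.

    Assumption: half of the arc lies in each cell, so the arc cost is the
    arc length times the mean of the two cell costs (unit cell size).
    """
    (r0, c0), (r1, c1) = a, b
    length = math.hypot(r1 - r0, c1 - c0)
    return length * (cost[r0][c0] + cost[r1][c1]) / 2.0

def path_cost(cost, cells):
    """Sum of arc costs along a contiguous sequence of cell centers."""
    return sum(arc_cost(cost, a, b) for a, b in zip(cells, cells[1:]))

# Hypothetical 3x3 cost surface with a costly cell in the middle.
grid = [[1.0, 1.0, 4.0],
        [1.0, 2.0, 4.0],
        [1.0, 1.0, 1.0]]

diagonal = [(0, 0), (1, 1), (2, 2)]                # two queen's-move arcs
around = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]  # four rook's-move arcs

print(path_cost(grid, diagonal))  # 3 * sqrt(2), about 4.243
print(path_cost(grid, around))    # 4.0
```

In this toy case the geometrically longer route is cheaper because it avoids the higher-cost center cell. In a multi-objective setting each arc would carry one such cost per objective.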

To convert a raster grid into a raster network, cells are most commonly connected to their neighbors according to a specified radius R (see Fig 1) [32]. R = 0 denotes connecting cells to their orthogonal neighbors (rook’s move), R = 1 denotes connecting cells to their orthogonal and diagonal neighbors (queen’s move), and R = 2 denotes additionally connecting cells via knight’s moves. In a knight’s move, the network arc spans two raster cells in one direction and one raster cell in the orthogonal direction, but it is considered a straight-line connection between the starting and ending points. The cost-distance for such an arc is a function of the attributes of the four raster cells it passes through multiplied by the total arc length [4]. The higher the radius used to generate the raster network, the less geometric distortion the network will have due to less restricted movement between cells; but this comes at the cost of a higher network density, resulting in longer computation times. Goodchild [32] calculated the worst-case geometric elongation for a shortest path when traversing a raster network of uniform cost, and found the R = 0 network imparts a 41.4% elongation error, the R = 1 network imparts an 8.2% elongation error, and the R = 2 network imparts a 2.79% elongation error. Higher radius values could be used to further reduce the elongation error, but Huber and Church [4] found that the R = 2 network provides the best trade-off between accuracy and computational burden on real geographic data. Elongation error is also encountered in the transportation literature as the route factor, defined as the ratio of the graph distance over the Euclidean distance between two points [33, 34].
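The three connectivities can be illustrated with a short Python sketch that lists the (row, column) offsets for each radius and counts the resulting directed arcs on a rectangular raster. The helper names `neighbor_offsets` and `count_arcs` are hypothetical, and counting arcs as directed is an assumption made here for illustration.

```python
def neighbor_offsets(radius):
    """Return the (row, col) offsets that define arcs for R = 0, 1, or 2."""
    rook = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    bishop = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    # Knight's moves: two cells in one direction, one in the orthogonal one.
    knight = [(dr, dc) for dr in (-2, -1, 1, 2) for dc in (-2, -1, 1, 2)
              if abs(dr) != abs(dc)]
    if radius == 0:
        return rook
    if radius == 1:
        return rook + bishop
    if radius == 2:
        return rook + bishop + knight
    raise ValueError("only R = 0, 1, 2 are sketched here")

def count_arcs(rows, cols, radius):
    """Count directed arcs: an offset (dr, dc) is valid from
    (rows - |dr|) * (cols - |dc|) cells of the raster."""
    return sum(max(0, rows - abs(dr)) * max(0, cols - abs(dc))
               for dr, dc in neighbor_offsets(radius))

for r in (0, 1, 2):
    print(r, len(neighbor_offsets(r)), count_arcs(1000, 1000, r))
```

For a 1000×1000 raster this gives 4, 8, and 16 neighbors per interior cell, and just under 16 million directed arcs for R = 2.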

Fig 1. Raster network connectivity (a) R = 0, (b) R = 1, (c) R = 2.

Huber and Church [4] demonstrated that different radius-defined networks may result in optimal paths that take very different routes. This affects not just path objective costs but also real-world engineering design decisions, and is something our study examines within the context of bi-objective shortest paths. They also discuss raster orientation error, in which the optimal path length and route may also be subject to error due to the orientation of the raster network relative to the underlying topography. These results were confirmed by Antikainen [35], who found that with center-connected paths the use of larger neighborhoods always yields better paths with less orientation error, at the expense of moderate increases in processing time. They also proposed an alternate boundary-based raster connectivity scheme, although we have yet to see their approach adopted in any GIS software. Seegmiller and Shirabe [31] propose an interesting method where in regions of constant raster cost (such as dense forest or water features or deserts), they define a start and end point within the monotonous region, and perform linear interpolation between the two to generate a corridor. This method enables much greater flexibility for path directions, but it is limited to straight-line paths within monotonous regions. Other publications that have looked at additional sources of error in raster network shortest path analysis include Huber [36] and Hong and Murray [37], who found that varying raster cell size can have major implications on the objective value and route of a shortest path.

A paper recently published by Schito and Moncecchi [38] uses a very interesting and promising approach to generate their connectivity graph. They generate a bespoke connectivity graph for the particular origin and destination they select, and use complex geometric decision rules to connect the nodes with arcs. Like the experiments in this article, their method was beyond the capabilities of any existing GIS software, and thus they had to program their own with Python. While they are unable to share their code due to non-disclosure agreements, the paths they generate are free of geometric distortions and were deemed highly satisfactory by their stakeholders. If their code is ever released publicly, it would certainly warrant an examination with the methods used in this paper.

Transmission line corridor location affects many nearby people, all with different concerns and priorities. A proposed design must consider all stakeholder interests, which oftentimes contain conflicting priorities. For example, a utility company may want to build the cheapest power lines taking a straight-line path, whereas environmentalists may want the route to divert around a sensitive habitat. Multiple objectives are commonly encountered in such contentious public projects where the interests of diverse stakeholders must be considered when developing a set of alternatives for debate and decision-making. This is especially true for transmission line location, since these are often considered undesirable to locate near humans or wildlife [39–41]. These contentious design problems, affectionately known as wicked problems [42], may be subject to sneaky manipulations in order to guide the decision toward one party’s desired outcome. This study uses biobjective shortest paths to shed light on the subtle techniques that GIS practitioners may use to accomplish such a desired outcome.

Because of the often-wicked nature of such problems, multi-objective optimization is commonly used for locating transmission lines. A multi-objective optimization problem entails finding the solutions that represent an optimal set of trade-off solutions between two or more objectives [24]. Aside from the methodologies we analyze, recent publications using multi-objective shortest paths to locate transmission lines over terrain raster data include [9–12]. All of these recent publications contain results with geometric distortions caused by the limitations on raster network connectivity in the GIS software they used, effects that we examine in this article.

Biobjective shortest path solutions are visualized and evaluated in both decision space and objective space (see Fig 2), where decision space is the real-world cartographic representation of the region where the path is being placed, and objective space depicts how that path performs with regard to each objective in comparison to other paths. Paths are linear features in decision space, and have corresponding point features in objective space (three paths are depicted in Fig 2a and the performance of those three paths is highlighted in Fig 2b). The set of non-dominated or Pareto-optimal solutions consists of those for which no other feasible solution performs at least as well in every objective and strictly better in at least one. These solutions form the trade-off frontier, which in the case presented in Fig 2 involves both minimizing cost and minimizing environmental impact.
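To make the dominance test concrete, the following Python sketch filters a handful of paths in objective space, taking a solution to be dominated when another is no worse in every objective and strictly better in at least one (both objectives minimized). The path labels and objective values are hypothetical and are not results from this study.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and better in one."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def non_dominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (economic cost, environmental impact) scores for four paths.
paths = {"A": (10.0, 2.0), "B": (6.0, 5.0), "C": (3.0, 9.0), "D": (7.0, 6.0)}

frontier = non_dominated(list(paths.values()))
print(frontier)  # path D is dominated by path B and drops out
```

The pairwise comparison is quadratic in the number of points, which is adequate for a sketch but not how large path sets would be screened in practice.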

Fig 2. Evaluating three paths in both (a) decision space, and (b) objective space.

Supported non-dominated solutions consist of the convex set of Pareto-optimal solutions and can be computed by solving single-objective problems combining the multiple-objectives via carefully selected weights [43]. Un-supported non-dominated solutions are the Pareto-optimal solutions that are not part of the convex frontier, and require specialized multi-objective algorithms to compute. Finding the set of all supported non-dominated solutions is computationally weakly polynomial, while computing the unsupported solutions is NP-Hard [44]. This study considers only the supported non-dominated solutions, as they provide a sufficiently rich solution set for demonstrating terrain raster representation errors with shortest path analysis.
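The distinction between supported and unsupported solutions can be sketched in Python by taking the lower-left convex hull of a set of points in objective space. This sketch assumes the input points are already mutually non-dominated and that both objectives are minimized; it returns only the extreme supported points (hull vertices), and the point values are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def extreme_supported(points):
    """Vertices of the lower-left convex hull (Andrew's monotone chain).

    Assumption: points are mutually non-dominated, so sorting by the first
    objective yields a strictly decreasing second objective.
    """
    hull = []
    for p in sorted(points):
        # Pop points that lie on or above the segment from hull[-2] to p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull

# Hypothetical non-dominated points in (objective 1, objective 2) space.
frontier = [(1.0, 10.0), (3.0, 6.0), (6.0, 5.0), (10.0, 1.0)]

supported = extreme_supported(frontier)
unsupported = [p for p in frontier if p not in supported]
print(supported)    # hull vertices
print(unsupported)  # (6.0, 5.0) is Pareto-optimal but lies above the hull
```

No positive weighting of the two objectives makes (6.0, 5.0) the minimizer of the weighted sum, which is why a weighted-sum method cannot find it.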

Materials and methods

Data

The analysis in this study used GIS raster data sets assembled and used by the Eastern Interconnection States’ Planning Council (EISPC). These data sets are intended to facilitate the identification of potential energy sites and transmission line corridors within the EISPC region, which spans 39 eastern US states, Washington D.C. and 8 Canadian provinces. The data was assembled jointly by Argonne National Laboratory, Oak Ridge National Laboratory and the National Renewable Energy Laboratory as a part of their EISPC Energy Zones Study (EZS) [45].

The EZS data contains numerous geographical information layers that would be used in a suitability analysis for locating new energy infrastructure, and is available through the EISPC Energy Zones Mapping Tool (ezmt.anl.gov). As of December 2020, the EZS contained 332 data layers, including land cover type, slope, water bodies, watersheds, essential habitats, earthquake intensities, existing transmission lines, substations, rail and roadways, just to name a few. Our study uses a 1000×1000 raster subset of the EZS data, with a 250 meter cell size centered at 36.516° N, 88.687° W. The region analyzed is in the Kentucky Lake region where the Tennessee River and the Cumberland River intersect the Ohio River, and includes portions of Tennessee, Kentucky, Illinois and Missouri. All maps in this article were created using the EZS numerical data, rendered programmatically with Java and the Processing API. No GIS software was used, and all code and data used to generate the maps are contained in the public GitHub repository [46]. All maps are oriented with North as up, and at this scale all maps in this article have an extent of 250 km × 250 km.

The EZS raster data was used to create two cost surfaces for a bi-objective optimization, where the competing objectives were to minimize 1) the infrastructure construction cost, and 2) the environmental impact. Since these objectives are not explicitly in the EZS data set, it was necessary to derive ratio-scale cost surfaces from the available layers. The slope layer, in percent slope, was used to develop a construction cost surface. The land cover type layer, categorized according to the National Land Cover Database 2016 (NLCD2016) which was publicly released in May 2019 [47, 48], was used to generate an environmental impact cost surface. The slope and land category attributes were then converted to ratio-scaled cell costs according to the terrain cost multipliers recommended by the Western Electricity Coordinating Council (WECC) [7], listed in Table 1. In all experiments, all cost-surfaces were scaled to equal ranges between the two objectives.

Table 1. Attribute reclassification for fixed Cmin and varying amplitude.

| NLCD2016 Value | NLCD2016 Feature | WECC Feature | WECC Value | [1,2] | [1,5] | [1,10] | [1,20] | [1,50] | [1,100] |
|---|---|---|---|---|---|---|---|---|---|
| 11 | open water | n/a | 3.250 | 2.000 | 5.000 | 10.000 | 20.000 | 50.000 | 100.000 |
| 21 | developed, open space | suburban | 1.270 | 1.120 | 1.480 | 2.080 | 3.280 | 6.880 | 12.880 |
| 22 | developed, low intensity | suburban | 1.270 | 1.120 | 1.480 | 2.080 | 3.280 | 6.880 | 12.880 |
| 23 | developed, medium intensity | urban | 1.590 | 1.262 | 2.049 | 3.360 | 5.982 | 13.849 | 26.960 |
| 24 | developed, high intensity | urban | 1.590 | 1.262 | 2.049 | 3.360 | 5.982 | 13.849 | 26.960 |
| 31 | barren land (rock/sand/clay) | scrub/flat | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 41 | deciduous forest | forested | 2.250 | 1.556 | 3.222 | 6.000 | 11.556 | 28.222 | 56.000 |
| 42 | evergreen forest | forested | 2.250 | 1.556 | 3.222 | 6.000 | 11.556 | 28.222 | 56.000 |
| 43 | mixed forest | forested | 2.250 | 1.556 | 3.222 | 6.000 | 11.556 | 28.222 | 56.000 |
| 52 | shrub/scrub | scrub/flat | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 71 | grassland/herbaceous | scrub/flat | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 81 | pasture/hay | farmland | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 82 | cultivated crops | farmland | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 90 | woody wetlands | wetland | 1.200 | 1.089 | 1.356 | 1.800 | 2.689 | 5.356 | 9.800 |
| 95 | herbaceous wetlands | wetland | 1.200 | 1.089 | 1.356 | 1.800 | 2.689 | 5.356 | 9.800 |

| Slope | WECC Feature | WECC Value | [1,2] | [1,5] | [1,10] | [1,20] | [1,50] | [1,100] |
|---|---|---|---|---|---|---|---|---|
| < 2% | flat | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| 2–8% | rolling hill | 1.300 | 1.600 | 3.400 | 6.400 | 12.400 | 30.400 | 60.400 |
| > 8% | mountain | 1.500 | 2.000 | 5.000 | 10.000 | 20.000 | 50.000 | 100.000 |

Fig 3 graphically displays the EISPC data maps used in the analysis, represented as the raw 1000×1000 rasters for (a) land use type and (b) slope, and then reclassified as (c) environmental impact and (d) economic cost. All maps in Fig 3 were created using Java for this study, but Fig 3a uses the same colors as the NLCD2016 class legend (https://www.mrlc.gov/sites/default/files/NLCD_Colour_Classification_Update.jpg), and the other maps in Fig 3 use light colors to represent low slope or cost, and dark colors to represent high slope or cost according to the classifications in Table 1. Note that in cost layers derived from the land cover layer (Fig 3c), rivers and lakes have a high cost because it is expensive to build over water. In costs derived from the slope layers (Fig 3d), water features have a low cost because water is represented as flat. In a real-world transmission line location analysis, water would likely have a high cost with respect to both environmental impact and monetary cost. The WECC classifications were developed as single-objective cost multipliers where a high cost in one measure would carry over to the overall composite cost. Rather than divert from the published WECC values, we chose to keep them since for all other attributes they provide a very good ratio-scaled mapping to objective costs and this choice does not affect the overall evaluation of terrain network representation parameters. But it is important to note that any real-world analysis should develop custom application-specific costs to map the attributes to the modeled objectives.

Fig 3. 1000×1000 EISPC raster data (a) land cover (b) slope (c) environmental impact (d) economic cost.

In (b) light color is less slope and dark color is more slope, and in (c) and (d) dark color is high cost and light color is low cost.

Algorithms

This analysis implemented the parallel bi-objective shortest path algorithm described in Medrano and Church [49] to compute the complete set of supported (convex) non-dominated path solutions using an origin at the lower-left corner of the raster region, and a destination at the top-right corner. This algorithm, called pNISE, is a parallel implementation of the NISE algorithm commonly used to find the supported solutions of biobjective network optimization problems [23]. The algorithm is efficient at computing the Pareto-optimal path sets for biobjective shortest path problems of reasonably large graph size, which in the case of the R = 2 network contains 1 million nodes and approximately 16 million arcs. All code was written in Java, and visual results were rendered using the Processing API (processing.org). The reader is invited to download both the Java and the Processing codes from GitHub [46]. Coding a custom geospatial analysis tool rather than depending on existing GIS software allowed for exploring capabilities beyond those built into existing GIS tools. By evaluating whether there are benefits to expanding how GIS software represents raster terrain as a network, we can make recommendations for features that should be added to GIS software.
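For readers unfamiliar with NISE, the following Python sketch shows a serial version of the general idea on a tiny hypothetical graph: find the two single-objective extremes, then repeatedly solve a weighted-sum shortest path whose weights are normal to the segment joining two adjacent known solutions. This is an illustrative sketch, not the pNISE implementation used in the study; the graph, the function names, and the small tie-breaking weight `1e-9` are all assumptions.

```python
import heapq

def weighted_shortest_path(arcs, source, target, w1, w2):
    """Dijkstra on the weighted sum w1*c1 + w2*c2; returns (z1, z2) totals."""
    best = {source: 0.0}
    heap = [(0.0, source, 0.0, 0.0)]
    while heap:
        d, u, z1, z2 = heapq.heappop(heap)
        if u == target:
            return (z1, z2)
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, c1, c2 in arcs.get(u, []):
            nd = d + w1 * c1 + w2 * c2
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v, z1 + c1, z2 + c2))
    raise ValueError("target not reachable")

def nise(arcs, source, target, tol=1e-9):
    """Serial NISE sketch: returns the extreme supported points found."""
    # Extremes; the tiny second weight breaks ties toward non-dominated paths.
    a = weighted_shortest_path(arcs, source, target, 1.0, 1e-9)
    b = weighted_shortest_path(arcs, source, target, 1e-9, 1.0)
    found = {a, b}
    stack = [(a, b)] if a != b else []
    while stack:
        p, q = stack.pop()                     # p has lower z1, q has lower z2
        w1, w2 = p[1] - q[1], q[0] - p[0]      # normal to segment pq
        r = weighted_shortest_path(arcs, source, target, w1, w2)
        if w1 * r[0] + w2 * r[1] < w1 * p[0] + w2 * p[1] - tol:
            found.add(r)                       # r lies strictly below pq
            stack += [(p, r), (r, q)]
    return sorted(found)

# Hypothetical graph: node -> list of (neighbor, cost1, cost2).
arcs = {"s": [("a", 1, 5), ("b", 2, 2), ("c", 5, 1), ("d", 2, 4)],
        "a": [("t", 1, 5)], "b": [("t", 3, 2)],
        "c": [("t", 5, 1)], "d": [("t", 2, 4)]}

print(nise(arcs, "s", "t"))
```

The path through node d has scores (4, 8), which are non-dominated but unsupported, so the weighted-sum searches never return it. Each segment's subproblem is independent of the others, which is what makes a parallel variant natural.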

Raster network connectivity

Huber and Church [4] previously examined the effects of network connectivity on single-objective shortest paths on fabricated data, finding that altering the connectivities resulted in differences in both path-objective performance and path topologies. In this section we perform a similar analysis, varying the network connectivity but using biobjective shortest paths on the much larger EISPC real-world raster data. For the different connectivities, we compare in objective space the objective values of the Pareto-optimal path set and the number of paths that compose the complete convex Pareto-optimal path set. Qualitatively, we compare the effects on path topologies in decision space, examining whether the analyses exhibit different geometric distortions due to the parameters we vary when run on the same data. This multi-objective approach allows us to test the impacts of network connectivity on path delineation and performance on multiple network topologies via multiple weightings of the underlying raster layers, rather than on just the single raster network used in previous studies.

Fig 4 displays the complete set of non-dominated paths with the South-West corner as the origin and the North-East corner as the destination, using networks with R = 0, 1, and 2. All use the WECC attributes scaled to [1, 10] for both objectives. What is immediately noticeable are the geometric artifacts for each type of network connectivity in the region west of the river with relatively constant cost in both objectives. In this region, the R = 0 paths have a strong tendency to traverse either vertically or to alternate between vertical and horizontal arcs in order to approximate a diagonal traversal. The alternating artifacts are a clear indicator of an orientation error of the type illustrated in Huber and Church [4]. The R = 1 paths display clear regions of vertical or diagonal routes, aligned with the restrictions imposed by the available arc directions. The R = 2 paths do sometimes align with knight’s move directions, but overall display the least amount of visible geometric distortion due to having the fewest route alignment restrictions. While discretizing continuous space will always result in some level of geometric distortion, it is clear that R = 2 connectivity dramatically reduces geometric distortion with minimal additional complexity as compared to the R = 1 connectivity that most GIS software currently uses.

Fig 4. Non-dominated solutions in decision space for (a) R = 0 (blue); (b) R = 1 (red); and (c) R = 2 (green).

Fig 5 shows the supported, Pareto-optimal paths for all three connectivities in objective space. The R = 0 Pareto set (blue) contains 88 distinct paths, the R = 1 Pareto set (red) contains 145 paths, and the R = 2 Pareto set (green) contains 270 paths. There are major differences in the location of the Pareto-frontiers in objective space: as the network connectivity increases, the objective scores decrease, in agreement with the theory developed in Goodchild [32] and previous experiments in Huber and Church [4]. The most pronounced difference is in going from R = 0 to R = 1, but there is still a distinct difference between R = 1 and R = 2. If using the objective values to calculate expected costs of multi-million-dollar projects, a three to four percent increase in path length can mean significant errors in the budgetary estimates of potential alternatives. Presumably in the interest of minimizing computation time to determine optimal routes with older hardware, common GIS software packages do not include R = 2 network connectivity as an option; this capability currently has to be manually scripted into the analysis. Since R = 2 shortest path computation has become trivial for most real-world data sets using modern computing hardware, GIS software makers should incorporate an option to generate R = 2 networks as built-in functionality.

Fig 5. Non-dominated solutions in objective space for R = 0 (blue), R = 1 (red), and R = 2 (green).

Attribute scale classification

A GIS practitioner will often need to reclassify attributes in order to prepare spatial data for analysis. Any shortest path analysis requires ratio-scaled data since a path cost is calculated as the sum of products of the attribute values and distances, and the resulting optimal path is not invariant to an additive shift of the attribute values. In other words, one cannot add a constant value to the costs of all nodes in a raster network and expect to get the same shortest path result. By adding a constant value, the shortest path algorithm will then be biased more towards finding a path that minimizes the number of arcs rather than the path of combined least impact with respect to the objectives. Thus, if an analysis is to be performed on land cover type and slope raster raw data, but decision-makers are actually trying to measure environmental impact and economic cost for locating the feature, then the data requires an attribute value conversion. Past literature for transmission line location has used a variety of approaches for these conversions:
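The effect of adding a constant can be seen in a small Python sketch with two hypothetical candidate paths, each described as a list of (arc length, arc cost) pairs. The numbers and the helper name `path_cost` are illustrative assumptions only.

```python
def path_cost(arcs, offset=0.0, scale=1.0):
    """Sum of length * cost over arcs, after scaling and shifting each cost."""
    return sum(length * (scale * cost + offset) for length, cost in arcs)

# Hypothetical alternatives between the same origin and destination.
direct = [(1.0, 5.0)] * 2   # 2 arcs through high-cost cells
detour = [(1.0, 1.5)] * 5   # 5 arcs through low-cost cells

# Original costs: the detour is cheaper (7.5 versus 10.0).
print(path_cost(direct), path_cost(detour))

# Multiplying every cost by a constant keeps the ranking (22.5 versus 30.0).
print(path_cost(direct, scale=3.0), path_cost(detour, scale=3.0))

# Adding 1 to every cost flips the ranking (12.0 versus 12.5),
# because the added constant penalizes the path with more arc length.
print(path_cost(direct, offset=1.0), path_cost(detour, offset=1.0))
```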

  1. The Georgia Transmission Corporation [5]
    • Scale all costs from 1 to 9 for all feature layers
    • No mention that scaling should reflect actual costs
  2. Bagli and Geneletti [6]
    • All costs scaled from 0 to 1
    • No mention that scaling should reflect actual costs
  3. Western Electricity Coordinating Council (WECC) [7]
    • Costs/mile for different kinds of transmission lines
    • Costs/acre to purchase or lease land
    • Cost multipliers for terrain type and slope
  4. Esri cost surface online tutorials [50, 51]
    • All costs scaled from 1 to 10
    • No mention that scaling should reflect actual costs

Approaches 1 and 2 were made for performing multi-objective shortest path analysis on raster terrain networks, and are problematic because they use arbitrary ranges for the cost values that have no real-world meaning for path performance. Approach 3 is intended for cost estimation of a potential route, and does use true ratio-scaled cost multipliers to reclassify data according to slope or land use. The output of this approach gives results in actual dollar values for each route analyzed. Approach 4 is intended as a how-to online tutorial and casually scales everything from 1 to 10, and makes no mention that attributes should in fact be scaled to actual real-world costs.

The geospatial analyses in this article demonstrate what can go wrong when cost ranges are picked arbitrarily without any correlation to actual costs. The analysis in the next section varies the amplitude while maintaining the same minimum cost and relative classification breaks. For example, suppose you have to reclassify features that are deemed as low, medium and high environmental impact into ratio-scaled data. One could assign a cost of 1 to low-impact cells, a cost of 2 to medium-impact cells, and a cost of 4 to high-impact cells. We define the minimum cost as Cmin and the maximum cost as Cmax, and write the reclassification range in shorthand as [Cmin, Cmax]. We define the amplitude as Cmax − Cmin. In the above example, Cmin = 1, Cmax = 4, the range is [1, 4], and the amplitude is 3. Now suppose we want to double the amplitude to 6 but maintain the same minimum cost; then low-impact would cost 1, medium-impact would cost 3, and high-impact would cost 7, i.e. [1, 7]. Overall, this is equivalent to marking the classification breaks on a rubber band, then anchoring the lower bound and stretching the upper bound, as shown in Fig 6a.
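The stretching operation can be written as a linear rescaling that anchors the minimum cost. The Python sketch below uses a hypothetical helper `stretch` and checks it against the low/medium/high example above and against the land cover WECC multipliers in Table 1 stretched to [1, 10].

```python
def stretch(values, new_max):
    """Anchor the minimum and linearly rescale so the maximum is new_max."""
    lo, hi = min(values), max(values)
    return [lo + (v - lo) * (new_max - lo) / (hi - lo) for v in values]

# Low, medium, high impact on [1, 4], stretched to [1, 7].
print(stretch([1, 2, 4], 7))  # [1.0, 3.0, 7.0]

# WECC land cover multipliers from Table 1:
# open water, suburban, urban, scrub/flat, forested, wetland.
wecc = [3.25, 1.27, 1.59, 1.0, 2.25, 1.2]
print([round(v, 3) for v in stretch(wecc, 10)])
# [10.0, 2.08, 3.36, 1.0, 6.0, 1.8], matching the [1,10] column
```

The relative position of each class between the minimum and maximum is unchanged; only the amplitude differs.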

Fig 6. Attribute classification modification via (a) stretching, or (b) shifting.

The analysis in the following section varies the minimum cost for the reclassification while maintaining the same amplitude and relative classification breaks, as shown in Fig 6b. A [2, 5] shift would have a cost of 2 for low-impact, 3 for medium-impact, and 5 for high-impact, effectively adding 1 to every value as compared to a [1, 4] classification.

In both experiments, the same underlying data are used with the exact same relative interval proportions, while varying only the amplitudes or the minimum costs. In other words, all experiments use the same WECC feature costs for each land category or slope, but the costs are then stretched according to the amplitude or shifted by the minimum cost value, as shown graphically in Fig 6. All classification experiments here use the same R = 2 network connectivity.
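
As an illustration only, the stretching and shifting operations can be written as a single linear map. The symbols $c$ (an original cost), $c'$ (the reclassified cost), and the primed bounds $C'_{min}$ and $C'_{max}$ (the new range) are notation introduced for this sketch and do not appear in the article:

$$c' = C'_{min} + \left(c - C_{min}\right)\frac{C'_{max} - C'_{min}}{C_{max} - C_{min}}$$

For the low/medium/high example with costs 1, 2, 4 on [1, 4], stretching to [1, 7] gives $c' = 1 + 2(c - 1)$, i.e. 1, 3, 7, and shifting to [2, 5] gives $c' = c + 1$, i.e. 2, 3, 5. Both agree with the values quoted in the text.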

Both experiments yield results allowing for both quantitative and qualitative comparisons. When varying the range and amplitude of the raster attributes, one cannot directly compare the objective values of the results. The key analysis is the qualitative comparison of how variations in the attribute ranges affect the path topologies in decision space. The number of paths in the Pareto-optimal path set can be compared quantitatively as well. Combined, these two measures indicate how choices made in selecting the attribute ranges affect the character and diversity of the resulting optimal path alternatives.

Let us informally define the dynamic range of a reclassification scale as follows, since it is a useful measure for comparing the effects of the reclassification schemes:

$$\text{Dynamic Range} = \frac{\text{maximum attribute cost}}{\text{minimum attribute cost}} = \frac{C_{max}}{C_{min}} \tag{1}$$

In Fig 6a, the shorter bars represent reclassifications with small dynamic range, and the longer bars with large dynamic range. In Fig 6b, the reclassifications on the left (close to zero) will have large dynamic ranges due to the smaller denominator, and those on the right will have smaller dynamic ranges.
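
To make the comparison concrete, the short sketch below evaluates Eq (1) for the ranges used in the two experiments (the range lists are taken from the Fig 7 and Fig 8 captions). The function name `dynamic_range` and the choice to return `None` for the undefined Cmin = 0 case are assumptions made for illustration.

```python
def dynamic_range(c_min, c_max):
    """Return Cmax / Cmin, or None when Cmin is zero and the ratio is undefined."""
    if c_min == 0:
        return None
    return c_max / c_min


# Ranges used when stretching (constant Cmin) and shifting (constant amplitude).
stretched = [(1, 2), (1, 5), (1, 10), (1, 20), (1, 50), (1, 100)]
shifted = [(0, 1), (0.1, 1.1), (0.2, 1.2), (1, 2), (2, 3), (5, 6)]

for label, ranges in (("stretched", stretched), ("shifted", shifted)):
    for c_min, c_max in ranges:
        print(label, [c_min, c_max], dynamic_range(c_min, c_max))
```

Stretching raises the dynamic range from 2 to 100, whereas shifting lowers it from about 11 at [0.1, 1.1] to 1.2 at [5, 6], with [0, 1] left undefined.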

Reclassification: Varying the amplitude

This experiment maintained Cmin = 1 while changing the attribute scale amplitude. Attribute reclassification values are shown in Table 1, varying Cmax from 2 to 100. Land use features were used for one objective layer, and the slope was used for the other objective layer. All experiments used equal ranges for the two objectives in order to maintain equal emphasis between them.

Fig 7 displays the results of this analysis in decision space. Low-amplitude (small dynamic range) solutions tended toward straight paths that approximate the Euclidean shortest path, while high-amplitude (large dynamic range) solutions tended to have greater deviations and spatial diversity. Recalling that arc costs are a product of the arc distance and the cell attribute values, it is clear that varying the ranges in this manner results in a trade-off between minimizing the spatial length of a shortest path, i.e. the Euclidean tendency, and the need to avoid cells with high cost attributes. As attribute amplitudes increase, the total number of non-dominated path solutions increases as well. This, too, is an indicator of the spatial vs. attribute trade-off, as the extreme and unrealistic case of a homogeneous flat-cost geographic space would have a single non-dominated solution consisting of the Euclidean shortest path.
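
The length-versus-attribute trade-off can be reproduced on a toy raster. The sketch below is not the article's code. It assumes a hypothetical 5 × 7 grid with a small block of high-cost cells, an 8-connected (R = 1) network rather than the R = 2 network used in the experiments, and an arc cost equal to the center-to-center distance times the mean attribute of the two end cells. The names `least_cost_path` and `build_cost` are illustrative.

```python
import heapq
import math


def least_cost_path(cost, start, goal):
    """Dijkstra's algorithm on an 8-connected raster network.

    Assumed arc cost: distance between cell centers times the mean
    attribute value of the two end cells.
    """
    rows, cols = len(cost), len(cost[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist.get(cell, math.inf):
            continue  # stale heap entry
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                arc = math.hypot(dr, dc) * (cost[r][c] + cost[nr][nc]) / 2
                if d + arc < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = d + arc
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (d + arc, (nr, nc)))
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]


def build_cost(c_max):
    """Hypothetical raster: cost 1 everywhere, c_max in a 3-cell block in column 3."""
    grid = [[1.0] * 7 for _ in range(5)]
    for r in (1, 2, 3):
        grid[r][3] = float(c_max)
    return grid


for c_max in (2, 10):
    total, path = least_cost_path(build_cost(c_max), (2, 0), (2, 6))
    print(f"[1, {c_max}]: cost = {total:.3f}, path = {path}")
```

With the range [1, 2] the least-cost path runs straight through the high-cost cell (total cost 7), while with [1, 10] it detours around the block (total cost 2 + 4√2 ≈ 7.66), mirroring the shift from Euclidean to attribute-driven paths as the amplitude grows.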

Fig 7. Decision space solutions for a constant Cmin while varying the amplitude to the following ranges: (a) [1,2]; (b) [1,5]; (c) [1,10]; (d) [1,20]; (e) [1,50]; and (f) [1,100].

Reclassification: Varying the minimum value

This experiment maintained a constant amplitude of 1 while varying the value of Cmin. Attribute reclassification values are shown in Table 2, varying Cmin from 0 to 5. The land use features were used for one objective layer, and the slope was used for the other objective layer. All experiments used equal ranges for the two objectives in order to maintain equal emphasis between them.

Table 2. Attribute reclassification for fixed amplitude and varying Cmin.

| NLCD 2016 Value | NLCD 2016 Feature | WECC Feature | WECC Value | [0,1] | [0.1,1.1] | [0.2,1.2] | [1,2] | [2,3] | [5,6] |
|---|---|---|---|---|---|---|---|---|---|
| 11 | open water | n/a | 3.250 | 1.000 | 1.100 | 1.200 | 2.000 | 3.000 | 6.000 |
| 21 | developed, open space | suburban | 1.270 | 0.120 | 0.220 | 0.320 | 1.120 | 2.120 | 5.120 |
| 22 | developed, low intensity | suburban | 1.270 | 0.120 | 0.220 | 0.320 | 1.120 | 2.120 | 5.120 |
| 23 | developed, medium intensity | urban | 1.590 | 0.262 | 0.362 | 0.462 | 1.262 | 2.262 | 5.262 |
| 24 | developed, high intensity | urban | 1.590 | 0.262 | 0.362 | 0.462 | 1.262 | 2.262 | 5.262 |
| 31 | barren land (rock/sand/clay) | scrub/flat | 1.000 | 0.000 | 0.100 | 0.200 | 1.000 | 2.000 | 5.000 |
| 41 | deciduous forest | forested | 2.250 | 0.556 | 0.656 | 0.756 | 1.556 | 2.556 | 5.556 |
| 42 | evergreen forest | forested | 2.250 | 0.556 | 0.656 | 0.756 | 1.556 | 2.556 | 5.556 |
| 43 | mixed forest | forested | 2.250 | 0.556 | 0.656 | 0.756 | 1.556 | 2.556 | 5.556 |
| 52 | shrub/scrub | scrub/flat | 1.000 | 0.000 | 0.100 | 0.200 | 1.000 | 2.000 | 5.000 |
| 71 | grassland/herbaceous | scrub/flat | 1.000 | 0.000 | 0.100 | 0.200 | 1.000 | 2.000 | 5.000 |
| 81 | pasture/hay | farmland | 1.000 | 0.000 | 0.100 | 0.200 | 1.000 | 2.000 | 5.000 |
| 82 | cultivated crops | farmland | 1.000 | 0.000 | 0.100 | 0.200 | 1.000 | 2.000 | 5.000 |
| 90 | woody wetlands | wetland | 1.200 | 0.089 | 0.189 | 0.289 | 1.089 | 2.089 | 5.089 |
| 95 | herbaceous wetlands | wetland | 1.200 | 0.089 | 0.189 | 0.289 | 1.089 | 2.089 | 5.089 |

| Slope | WECC Feature | WECC Value | [0,1] | [0.1,1.1] | [0.2,1.2] | [1,2] | [2,3] | [5,6] |
|---|---|---|---|---|---|---|---|---|
| < 2% | flat | 1.000 | 0.000 | 0.100 | 0.200 | 1.000 | 2.000 | 5.000 |
| 2–8% | rolling hill | 1.300 | 0.600 | 0.700 | 0.800 | 1.600 | 2.600 | 5.600 |
| > 8% | mountain | 1.500 | 1.000 | 1.100 | 1.200 | 2.000 | 3.000 | 6.000 |
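
The columns of Table 2 appear consistent with a linear min-max rescaling of the WECC values onto each target range. That relationship is inferred here from the tabulated numbers rather than quoted from the article, so the sketch below should be read as a reconstruction under that assumption. The function names `rescale` and `reclassify` are illustrative.

```python
def rescale(value, v_min, v_max, c_min, c_max):
    """Linearly map value from [v_min, v_max] onto [c_min, c_max]."""
    return c_min + (value - v_min) * (c_max - c_min) / (v_max - v_min)


# WECC multipliers as listed in Table 2.
WECC_LAND = {
    "scrub/flat": 1.00, "farmland": 1.00, "wetland": 1.20, "suburban": 1.27,
    "urban": 1.59, "forested": 2.25, "open water": 3.25,
}
WECC_SLOPE = {"flat": 1.0, "rolling hill": 1.3, "mountain": 1.5}


def reclassify(multipliers, c_min, amplitude=1.0):
    """Rescale a layer's multipliers onto [c_min, c_min + amplitude], rounded to 3 places."""
    lo, hi = min(multipliers.values()), max(multipliers.values())
    return {
        name: round(rescale(v, lo, hi, c_min, c_min + amplitude), 3)
        for name, v in multipliers.items()
    }


for c_min in (0.0, 0.1, 0.2, 1.0, 2.0, 5.0):
    print(c_min, reclassify(WECC_LAND, c_min), reclassify(WECC_SLOPE, c_min))
```

Each layer is rescaled against its own minimum and maximum, which is why both the land use and slope layers span the same range in every column.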

Fig 8 displays the results of this analysis in decision space using the shifted attribute scales, all with an amplitude of 1. What is immediately noticeable is the extreme behavior of the [0,1] range. In the western area with homogeneous regions of zero cost, the paths appear to wander aimlessly in a sort of Brownian motion. The zero-cost cell attributes mean that arc costs in this region are also zero, so there is no penalty for taking a long and winding path. Thus, the path generated by Dijkstra’s algorithm is subject to the pseudo-random motion that comes from the tie-breaking rules used in the particular implementation. In terms of dynamic range, a minimum cost of zero places a zero in the denominator, which results in an undefined ratio. These model results clearly do not correspond to any real-world behavior, and as such, zero-cost attributes should always be avoided. Slightly above the [0,1] range, the solutions were diverse and mostly influenced by the attributes. This represents a large dynamic range due to the small denominator in the minimum attribute costs. As the ranges continue to shift higher, i.e. toward smaller dynamic range, the paths become more Euclidean as the arc costs gradually emphasize geometry over attribute values. The attribute cost values increase, but the ratio between high and low costs shrinks toward 1. With regard to the number of solutions, the higher the range is shifted, the fewer non-dominated solutions exist, with the exception of the pathological [0,1] case, which had far fewer solutions than the slightly higher [0.1, 1.1] range.
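
The zero-cost tie can be seen with a few lines of arithmetic. The sketch below compares a straight path and a winding path across a homogeneous region. It assumes, for illustration, that every arc in the region costs its length times a single attribute value; the two paths and the function names are hypothetical.

```python
import math


def path_length(cells):
    """Sum of center-to-center distances along a sequence of (row, col) cells."""
    return sum(math.dist(a, b) for a, b in zip(cells, cells[1:]))


def path_cost(cells, attribute):
    """Cost of a path through a homogeneous region with one attribute value."""
    return path_length(cells) * attribute


straight = [(0, c) for c in range(5)]
winding = [(0, 0), (1, 1), (2, 0), (3, 1), (2, 2), (1, 3), (0, 4)]

for attribute in (0.0, 0.1):
    print(attribute, path_cost(straight, attribute), path_cost(winding, attribute))
```

With an attribute of 0 both paths cost exactly 0, so a shortest path algorithm has no basis for preferring the straight one. With an attribute of 0.1 the winding path is more than twice as expensive and the tie disappears.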

Fig 8. Decision space solutions for a constant amplitude while varying Cmin to the following ranges: (a) [0,1]; (b) [0.1,1.1]; (c) [0.2,1.2]; (d) [1,2]; (e) [2,3]; and (f) [5,6].

Discussion: The impacts of dynamic range

The experiments here found that a lower dynamic range biases results toward Euclidean shortest paths, with fewer and less diverse solutions. A higher dynamic range biases results toward paths that emphasize minimizing objective costs, with more solutions of greater spatial diversity. Even in regions that appear relatively homogeneous on the map, if the selected attribute scale has a large dynamic range then the resulting paths will include large deviations to avoid regions of slightly higher cost (see Figs 7 and 8). In the case where Cmin = 0 the dynamic range is undefined, and the results showed such a reclassification scheme to be problematic, with path solutions that were random and unrealistic; it should be avoided at all costs. Because these results are derived from analysis on numerous raster terrain networks via the various weightings between the two objectives, as opposed to previous literature that only looked at one single-objective network, it is safe to say that these path deviation correlations with the dynamic range are more general than those from the previous literature.

While in this article we display the classification results for R = 2 networks only, we observed similar trends for R = 1 and R = 0 connectivities as well. The GitHub repository [46] contains a folder with screen captures for the analysis results of all combinations of R-values and classifications mentioned in this article, and we invite the reader to review them.

The variability of results as a function of the attribute scales is why it is imperative that reclassification costs are assigned as ratio-scaled values based on real-world metrics, and not arbitrary interval-scaled values. This might seem obvious, but for a spatially unaware user learning how to perform this analysis, three out of the four methodologies cited earlier make no mention of scaling to real-world costs and instead instruct users to scale to arbitrary cost ranges. Calculating ratio-scaled costs is simple for tangible expenses such as the economic cost to locate in a particular cell, and is done quite well in the WECC cost estimation guidelines. But for somewhat intangible costs such as environmental impact or maintenance accessibility, the process is less clear. Questionnaires [52] or approaches like that of AHP [53] should be implemented to develop true ratio scales for such intangible costs, so that there is supporting evidence to justify the relative cost-ratios despite the subjective nature of the criteria.
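
As a rough illustration of how pairwise judgments can yield a ratio scale, the sketch below uses the row geometric mean method, a common approximation to the principal-eigenvector calculation in AHP and not necessarily the procedure in [53]. The judgment matrix is hypothetical and the function name is illustrative.

```python
import math


def ratio_scale_from_pairwise(matrix):
    """Row geometric means of a pairwise comparison matrix, scaled so the smallest weight is 1."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    smallest = min(geo_means)
    return [g / smallest for g in geo_means]


# Hypothetical judgments: entry [i][j] states how many times more impactful
# class i is than class j, for classes (low, medium, high).
judgments = [
    [1.0, 1 / 2, 1 / 4],
    [2.0, 1.0, 1 / 2],
    [4.0, 2.0, 1.0],
]

print(ratio_scale_from_pairwise(judgments))
```

For this perfectly consistent matrix the derived costs are approximately 1, 2 and 4 for low, medium and high impact, matching the ratio-scaled example used earlier. Real survey responses are rarely this consistent, so a consistency check would normally accompany the calculation.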

Conclusions

Normative spatial analysis is used when decisions must be made on how to spatially configure a new design in order to maximize its utility or minimize its cost. The process of performing this analysis requires first generating a model from available data that reflects the conditions within the regional extent and relevant design factors. When designing projects to be located on natural landscapes, raster data is often the most appropriate; but as with any model, there are many considerations that must be made to ensure model accuracy. In this study, we have examined the errors and distortions associated with shortest path analysis when using raster representation of terrain, and how decisions made in the data preparation stage affected the results of the analysis.

First, we examined the effects of raster network connectivity on biobjective shortest path analysis, and found that the number of solutions, the spatial configurations of the routes, and the objective values of the Pareto-optimal set were all affected by the choice of the network connectivity used. While most popular GIS software packages only provide the ability to run analyses on R = 0 and R = 1 networks, it was found that R = 2 networks provided more alternatives, with less orientation error, and that their solutions consistently had lower objective costs than the R = 1 network paths. Given that continuing advances in computational power make shortest path analysis a trivial task for most common data, we unequivocally believe it is long overdue for major GIS software companies to add the built-in ability to generate R = 2 networks for spatial analysis.
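
For readers who want to build such networks themselves, the sketch below lists the neighbor offsets for each connectivity level. It assumes the common convention that R = 0 allows rook moves, R = 1 adds diagonal moves (together the queen's moves), and R = 2 adds knight's moves. It covers only the network topology, not how arc attributes are computed, and the function name is illustrative.

```python
import math


def neighbor_offsets(r):
    """Return (row, col) offsets to the neighbors of a cell for connectivity R = 0, 1 or 2."""
    rook = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    diagonal = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    knight = [(-2, -1), (-2, 1), (-1, -2), (-1, 2),
              (1, -2), (1, 2), (2, -1), (2, 1)]
    if r == 0:
        return rook
    if r == 1:
        return rook + diagonal
    if r == 2:
        return rook + diagonal + knight
    raise ValueError("only R = 0, 1 and 2 are covered in this sketch")


for r in (0, 1, 2):
    offsets = neighbor_offsets(r)
    lengths = sorted({round(math.hypot(dr, dc), 3) for dr, dc in offsets})
    print(f"R = {r}: {len(offsets)} arcs per cell, arc lengths {lengths}")
```

Each step up in R doubles the number of directions a path can leave a cell (4, 8, then 16), which is what allows R = 2 paths to approximate arbitrary headings more closely.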

Next, we examined the effects of reclassification to convert raw data features into cost surface rasters. We defined the dynamic range as the ratio of the maximum cost to the minimum cost. We then ran the same analysis with the same relative interval breaks between different attribute costs, varying only the range of the attribute cost mappings. The purpose of this experiment was to demonstrate that selecting arbitrary cost ranges, such as from 1 to 5 or 9 or 10, will significantly impact the results. We found that lower dynamic range reclassifications resulted in fewer path solutions that tended toward straight-line Euclidean shortest paths, while higher dynamic range reclassifications tended toward more path solutions that emphasized avoiding high attribute costs more than geometric factors. If an analyst has a motivation or preference for a more Euclidean solution, they could shift the classification range to a lower dynamic range to achieve this while still giving the appearance of a truly objective analysis. It should always be emphasized in all methodologies and tutorials that in order to maintain complete objectivity, ratio-scaled reclassifications should always be used: via direct conversion for tangible costs such as construction costs, or via surveys of relevant parties to gauge the proportional attribute impacts. These considerations are completely missing in all but one of the cited tutorials, resulting in methodologies unrepresentative of real-world conditions. Constituents should also inquire about the methods used in such an analysis when presented with alternatives developed by an entity that may have its own motives.

The geographic world is infinitely complex, and no model can perfectly capture every nuance of the spatial features that will affect the outcome of a normative spatial analysis. Approximations must be made in order to represent the world in a manner that can be computed upon, and these approximations will always come with sources of error and distortions. But it is important to be aware of these representation errors, and to use the best practices outlined here to mitigate their effects on the analysis so that the model can best reflect accurate real-world criteria and result in objectively unbiased solutions.

Data Availability

Data are available at https://doi.org/10.5281/zenodo.4540743, which is an archived snapshot of the following open source repository: https://github.com/antoniomedrano/pNISE.

Funding Statement

Argonne National Laboratories, grant number 1F-32422. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Fotheringham S, Rogerson P. Spatial analysis and GIS: CRC Press; 2014.
  • 2. Monmonier M. How to Lie with Maps. 2nd ed: University of Chicago Press; 1996.
  • 3. Monmonier M. Lying with maps. Statistical Science. 2005;20(3):215–22.
  • 4. Huber DL, Church RL. Transmission Corridor Location Modeling. Journal of Transportation Engineering-Asce. 1985;111(2):114–30.
  • 5. Houston G, Johnson C. EPRI-GTC Overhead Electric Transmission Line Siting Methodology. Technical Report. 2006:198.
  • 6. Bagli S, Geneletti D, Orsi F. Routeing of power lines through least-cost path analysis and multicriteria evaluation to minimise environmental impacts. Environmental Impact Assessment Review. 2011;31(3):234–9.
  • 7. Mason T, Curry T, Wilson D. Capital Costs for Transmission and Substations. Black & Veatch prepared for WECC: 2012 Proj. No. 176322 Contract No.: 176322.
  • 8. Scaparra MP, Church RL, Medrano FA. Corridor location: the multi-gateway shortest path model. Journal of Geographical Systems. 2014;16(3):287–309. Epub 2 March 2014.
  • 9. Bachmann D, Bökler F, Kopec J, Popp K, Schwarze B, Weichert F. Multi-Objective Optimisation Based Planning of Power-Line Grid Expansions. ISPRS International Journal of Geo-Information. 2018;7(7):258. doi: 10.3390/ijgi7070258
  • 10. Ekel PY, Lisboa A, Pereira J Jr, Vieira D, Silva L, D’Angelo M. Two-stage multicriteria georeferenced express analysis of new electric transmission line projects. International Journal of Electrical Power & Energy Systems. 2019;108:415–31.
  • 11. Eroğlu H, Aydin M. Optimization of electrical power transmission lines’ routing using AHP, fuzzy AHP, and GIS. Turkish Journal of Electrical Engineering and Computer Science. 2015;23(5):1418–30.
  • 12. Shandiz SG, Doluweera G, Rosehart W, Behjat L, Bergerson J. Investigation of different methods to generate Power Transmission Line routes. Electric Power Systems Research. 2018;165:110–9.
  • 13. Schito J, Wissen Hayek U, Raubal M, editors. Enhancing multi criteria decision analysis for planning power transmission lines. 10th International Conference on Geographic Information Science (GIScience 2018); 2018: Schloss Dagstuhl, Leibniz-Zentrum für Informatik.
  • 14. Bruce B, Haneberg WC, Drazba MC. Using Qualitative Slope Hazard Maps and Quantitative Probabilistic Slope Stability Models to Constrain Least-Cost Pipeline Route Optimization. Offshore Technology Conference; 2013/5/6/; Houston, TX. OTC2003.
  • 15. Berry JK, King MD, Lopez C. A web-based application for identifying and evaluating alternative pipeline routes and corridor. GITA Oil and Gas Conference; Houston, TX; 2004.
  • 16. Iqbal M, Sattar F, Nawaz M, editors. Planning a Least Cost Gas Pipeline Route A GIS & SDSS Integration Approach. 2006 International Conference on Advances in Space Technologies; 2006 Sept. 2006.
  • 17. Atkinson DM, Deadman P, Dudycha D, Traynor S. Multi-criteria evaluation and least cost path analysis for an arctic all-weather road. Applied Geography. 2005;25(4):287–307.
  • 18. Pushak Y, Hare W, Lucet Y. Multiple-path selection for new highway alignments using discrete algorithms. European Journal of Operational Research. 2016;248(2):415–27.
  • 19. Urban D, Keitt T. Landscape connectivity: a graph-theoretic perspective. Ecology. 2001;82(5):1205–18.
  • 20. Adriaensen F, Chardon J, De Blust G, Swinnen E, Villalba S, Gulinck H, et al. The application of ‘least-cost’ modelling as a functional landscape model. Landscape and urban planning. 2003;64(4):233–47.
  • 21. Pascual-Hortal L, Saura S. Comparison and development of new graph-based landscape connectivity indices: towards the priorization of habitat patches and corridors for conservation. Landscape ecology. 2006;21(7):959–67.
  • 22. Rouget M, Cowling RM, Lombard AT, Knight AT, Kerley GI. Designing large-scale conservation corridors for pattern and process. Conservation Biology. 2006;20(2):549–61. doi: 10.1111/j.1523-1739.2006.00297.x
  • 23. Matisziw TC, Gholamialam A, Trauth KM. Modeling habitat connectivity in support of multiobjective species movement: An application to amphibian habitat systems. PLOS Computational Biology. 2020;16(12):e1008540. doi: 10.1371/journal.pcbi.1008540
  • 24. Gholamialam A, Matisziw TC. Modeling Bikeability of Urban Systems. Geographical Analysis. 2019;51(1):73–89. doi: 10.1111/gean.12159
  • 25. Kimerling A, Buckley AR, Muehrcke PC, Muehrcke JO. Map Use: Reading Analysis Interpretation. 7th ed. Redlands, California: Esri Press Academic; 2012.
  • 26. Daskin MS, Maass KL. The p-median problem. Location science: Springer; 2015. p. 21–45.
  • 27. Gaboardi JD, Folch DC, Horner MW. Connecting Points to Spatial Networks: Effects on Discrete Optimization Models. Geographical Analysis. 2019;0(0):1–24. doi: 10.1111/gean.12211
  • 28. Miller HJ. Geographic representation in spatial analysis. Journal of geographical systems. 2000;2(1):55–60.
  • 29. Tong D, Murray AT. Spatial optimization in geography. Annals of the Association of American Geographers. 2012;102(6):1290–309.
  • 30. Shirabe T. A method for finding a least-cost wide path in raster space. International Journal of Geographical Information Science. 2016;30(8):1469–85. doi: 10.1080/13658816.2015.1124435
  • 31. Seegmiller L, Shirabe T, Tomlin CD. A method for finding least-cost corridors with reduced distortion in raster space. International Journal of Geographical Information Science. 2020:1–22. doi: 10.1080/13658816.2020.1850734
  • 32. Goodchild M. An evaluation of lattice solutions to the problem of corridor location. Environment and Planning A. 1977;9(7):727–38.
  • 33. Black WR. Transportation: a geographical analysis: Guilford Press; 2003.
  • 34. O’Sullivan D. Spatial network analysis. Handbook of regional science. Berlin Heidelberg: Springer; 2014. p. 1253–73.
  • 35. Antikainen H. Comparison of Different Strategies for Determining Raster-Based Least-Cost Paths with a Minimum Amount of Distortion. Transactions in GIS. 2013;17(1):96–108.
  • 36. Huber DL. Alternative methods in corridor routing: University of Tennessee, Knoxville; 1980.
  • 37. Hong I, Murray AT. Assessing Raster GIS Approximation for Euclidean Shortest Path Routing. Transactions in GIS. 2016;20(4):570–84. doi: 10.1111/tgis.12160
  • 38. Schito J, Moncecchi D, Raubal M. Determining transmission line path alternatives using a valley-finding algorithm. Computers, Environment and Urban Systems. 2021;86:101571. doi: 10.1016/j.compenvurbsys.2020.101571
  • 39. Barrientos R, Ponce C, Palacín C, Martín CA, Martín B, Alonso JC. Wire Marking Results in a Small but Significant Reduction in Avian Mortality at Power Lines: A BACI Designed Study. PLOS ONE. 2012;7(3):e32569. doi: 10.1371/journal.pone.0032569
  • 40. Huang J, Tang T, Hu G, Zheng J, Wang Y, Wang Q, et al. Association between Exposure to Electromagnetic Fields from High Voltage Transmission Lines and Neurobehavioral Function in Children. PLOS ONE. 2013;8(7):e67284. doi: 10.1371/journal.pone.0067284
  • 41. Loss SR, Will T, Marra PP. Refining Estimates of Bird Collision and Electrocution Mortality at Power Lines in the United States. PLOS ONE. 2014;9(7):e101565. doi: 10.1371/journal.pone.0101565
  • 42. Liebman JC. Some Simple-Minded Observations on the Role of Optimization in Public Systems Decision-Making. Interfaces. 1976;6(4):102–8.
  • 43. Cohon JL, Church RL, Sheer DP. Generating multiobjective trade-offs: an algorithm for bicriterion problems. Water Resources Research. 1979;15(5):1001–10.
  • 44. Ehrgott M, Wiecek MM. Multiobjective programming. In: Figueira J, Greco S, Ehrgott M, editors. Multiple criteria Decision Analysis: State of the art surveys. New York, NY: Springer Science+Business Media, Inc.; 2005. p. 667–708.
  • 45. Kuiper J, Ames DP, Koehler D, Lee R, Quinby T. Web-Based Mapping Applications for Solar Energy Project Planning. Idaho National Laboratory: Laboratory IN; 2013 INL/CON-13-28372 Contract No.: INL/CON-13-28372.
  • 46. Medrano FA. pNISE–a parallelized NISE algorithm [code repository]. 2021 [March 21, 2021]. https://github.com/antoniomedrano/pNISE. doi: 10.5281/zenodo.4540743
  • 47. Jin S, Homer C, Yang L, Danielson P, Dewitz J, Li C, et al. Overall Methodology Design for the United States National Land Cover Database 2016 Products. Remote Sensing. 2019;11(24):2971.
  • 48. Yang L, Jin S, Danielson P, Homer C, Gass L, Bender SM, et al. A new generation of the United States National Land Cover Database: Requirements, research priorities, design, and implementation strategies. ISPRS journal of photogrammetry and remote sensing. 2018;146:108–23.
  • 49. Medrano FA, Church RL. A Parallel Computing Framework for Finding the Supported Solutions to a Biobjective Network Optimization Problem. Journal of Multi-Criteria Decision Analysis. 2015;22(5–6):244–59. doi: 10.1002/mcda.1541
  • 50. Esri. Tools > Tool reference > Spatial Analyst toolbox > Distance toolset > Distance toolset concepts—Creating a cost surface raster: Esri; 2020 [December 26, 2020]. http://desktop.arcgis.com/en/arcmap/latest/tools/spatial-analyst-toolbox/creating-a-cost-surface-raster.htm.
  • 51. Esri. Cost-distance analysis workflow using ArcGIS Desktop—Lesson 1: Creating a cost surface: Esri; 2020 [December 26, 2020]. http://desktop.arcgis.com/en/analytics/case-studies/cost-lesson-1-desktop-creating-a-cost-surface.htm.
  • 52. Ghandehari Shandiz S, Doluweera G, Rosehart WD, Behjat L, Bergerson JA. Investigation of different methods to generate Power Transmission Line routes. Electric Power Systems Research. 2018;165:110–9. doi: 10.1016/j.epsr.2018.08.012
  • 53. Saaty TL. The seven pillars of the analytic hierarchy process. Multiple Criteria Decision Making in the New Millennium: Springer; 2001. p. 15–37.

Decision Letter 0

Timothy C Matisziw

25 Jan 2021

PONE-D-20-40796

Effects of Raster Terrain Representation on GIS Shortest Path Analysis

PLOS ONE

Dear Dr. Medrano,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The referees find this work well-written, interesting, and relevant. Based on my reading, I agree. The referees recommend some minor revisions be made prior to acceptance.  In particular, they note that the intro and motivation could use a little more finesse.  They also mention a variety of relatively simple changes that could be made to strengthen the presentation of the research and more recent research on the topic that should be considered.  I’ve also listed some comments below for you to take into consideration in your revision.

Other Editor Comments:

1. Data availability statement: It’s a requirement to make the data available through a permanent data repo such as figshare or Mendeley data and the dataset should have a permanent doi.  While you can create a doi to reference work in github, it’s a more difficult process.

2. There are no citations to research appearing in PLOS journals.  Some citations should be added to PLOS manuscripts to strengthen the relationship to the journal’s readership.

3. Abstract: It may be worth mentioning a few example applications that may be familiar to the readership of PLOS that could benefit from your research findings just to provide the readers with some context.  Also, ‘GIS’ is used in the abstract, but is not defined.  Those two items should be addressed in the intro and motivation as well.

4. The key words should be terms that are not used in the title and/or abstract.  The key words are supplemental search terms that make your article more discoverable.

5. The description of the R=2 criterion and its actual implementation are not clearly aligned.  While the R=2 is explained as a Queen’s move followed by a Knight’s move in the text, the figure shows straight lines between the raster cells.  This may be confusing to the journal readership.  I’m assuming the actual movement between a cell and its Queen + Knight counterpart is by way of a straight line and not one following the Queen+Knight movement (a longer path).  Also, isn’t it sufficient to term it a Knight’s move?  Further, it may be worth mentioning in the manuscript exactly how arc attributes were computed for the Knight’s move (e.g., was it a sum of the cells in the Knight’s move or the sum of the portions of the cells in the straight line connecting the two cells?).

Please submit your revised manuscript by Mar 11 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Timothy C. Matisziw, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that Figure(s) 3, 4, 7 and 8 in your submission contain map images which may be copyrighted. All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For these reasons, we cannot publish previously copyrighted maps or satellite images created using proprietary data, such as Google software (Google Maps, Street View, and Earth). For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.

We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:

a)  You may seek permission from the original copyright holder of Figure(s) 3, 4, 7 and 8 to publish the content specifically under the CC BY 4.0 license. 

We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:

“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”

Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.

In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”

b)   If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only.

The following resources for replacing copyrighted map figures may be helpful:

USGS National Map Viewer (public domain): http://viewer.nationalmap.gov/viewer/

The Gateway to Astronaut Photography of Earth (public domain): http://eol.jsc.nasa.gov/sseop/clickmap/

Maps at the CIA (public domain): https://www.cia.gov/library/publications/the-world-factbook/index.html and https://www.cia.gov/library/publications/cia-maps-publications/index.html

NASA Earth Observatory (public domain): http://earthobservatory.nasa.gov/

Landsat: http://landsat.visibleearth.nasa.gov/

USGS EROS (Earth Resources Observatory and Science (EROS) Center) (public domain): http://eros.usgs.gov/#

Natural Earth (public domain): http://www.naturalearthdata.com/

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: N/A

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The paper details representation and manipulation issues associated with the optimization of a corridor or path across space. In particular, the very much standard approach for identifying an optimal corridor/path is shown to be sensitive to potential movement in a GIS environment as well as attribute scaling. Further, the approach within the context of multi-objective optimization is shown to be further impacted.

In general, this is a fantastic topic, and of much contemporary concern. The paper is very good, but probably could be improved through further revision. Offered below are aspects of the presentation of the work that could be enhanced prior to publication:

1. The paper begins by suggesting that GIS is an increasingly popular form of analytics, yet results can be sensitive to spatial representation. It then proceeds to outline the case of corridor/path analysis. Not sure this is the most effective motivation, as corridor/path siting across continuous space is a fundamentally important problem on its own. Thus, either the transition to corridor analysis should be improved, or the paper should begin with corridor analysis and make the general connection to GIS later in the section.

2. The corridor context is not one of generating a raster network, but rather that has been a common discretization approach taken to make the problem more manageable, particularly in a GIS environment. The discussion of the problem seems to suggest raster issues, but actually these are the byproduct of a selected abstraction process. This should be made more clear in the paper.

3. Similarly, multi-objective is but one approach taken, recognizing that many considerations go into corridor / path optimization. I think the MOSP introduction is a bit misleading, and sort of awkwardly brought into the paper.

4. Figure 1 seems familiar. This may well be an often used way to depict options for movement in a raster environment, but perhaps it needs a source citation.

5. Figure 2b, showing only two non-dominated solutions, makes it difficult to comparatively understand a dominated solution. I realize that three paths are shown in Figure 2a, but perhaps another path should be added. I guess too that the different colors make it hard to visualize and understand this.

6. Referencing in the paper is inconsistent. In some places, two authors are cited using an and, e.g., as Huber and Church, but in others only a comma is used, e.g., Seegmiller, Shirabe. This happens in many places, so should be cleaned up.

7. Minor editing. I saw a few things, but nothing major. These can be found by the authors in another reading of the paper, so I will not provide particular instances.

In summary, I like much of the paper, and think that the results are compelling. This should be published. However, I believe that the introduction and motivation could be enhanced through further revision.

Reviewer #2: This is an interesting article covering issues that arise when conducting shortest path analysis on raster terrains. While the article is well-written and makes a contribution to the field, I would like to see these changes in the manuscript.

1. In addition to network connectivity and the range of the attribute scale, the raster cell size is another factor that can impact both the computation time and the result of shortest path analysis. It would be interesting to add another dimension to the current work by varying the cell size of the raster data and reporting how the number of nondominated solutions, objective values, diversity of solutions and computation time change.

2. It would be good to add another table and describe 3 current networks (R1, R2 and R3) as well as new ones with different cell sizes (if any) as described in comment #1. Name and report number of nodes, arcs, cell size, … for each network.

3. There are new and closer works to the scope of the article and journal that can be used in the introduction section when describing multiobjective shortest path models. I have included two of them here:

“Matisziw TC, Gholamialam A, Trauth KM. Modeling habitat connectivity in support of multiobjective species movement: An application to amphibian habitat systems. PLoS Comput Biol. 2020; 16(12): e1008540. https://doi.org/10.1371/journal.pcbi.1008540.”

“Gholamialam A, Matisziw TC. Modeling bikeability of urban systems. Geogr Anal. 2019; 51(1):73–89.”

4. Provide legend for raster data in Fig. 3. Land use/land cover type (a), range of variables for slope (b) and costs (c and d).

5. In Fig. 4, use letters a, b and c for each panel (similar to Fig. 3) and use in the caption. Update R1, R2 and R3 with network names described in comment #2 if changes were made.

6. Use a, b, … letters for each panel in Fig. 7 and use them in the caption. Also, move range and number of solutions for each panel to the caption and refer to them by the letters.

7. It seems like 15 biobjective shortest path models have been experimented in this research. Three of which for different connectivity schemes (R1, R2 and R3), six models for fixed minimum range and six models for range shift. I highly recommend to report computation time for every single experiment. This can be worked into a new table describing range, computation time, network name and connectivity method (R1, R2 and R3). If possible, this table can be merged with the new table described in comment #2.

8. There are many ways to summarize objective values of multiobjective shortest paths. Some good examples can be found in two new references mentioned in comment #3. Try and summarize path objective values for each model and explain the variations among objectives and different models. For each experiment, the average, standard deviation, and range of objectives for each supported nondominated solution set can be easily reported.

9. Page 1, abstract, line 4. The word “on” is missing in the sentence. It should be: “Users should understand the impacts that data representations may have on their results in order to prevent distortions in their outcomes.”

10. Page 5, line 8 and 9. Replace decision space with objective space.

11. Page 8, lines 5-9. How was that three or four percent change measured? It should first be reported in a table, a figure or a graph so the reader knows how those numbers have been calculated.

12. Please consider numbering lines in your revised word document so reviewers can point to the text more easily.

13. Page 8, four listed approaches can be just explained in the text.

14. Table 1 seems like two tables with one caption. It should be merged or have separate captions.

15. Same comment for Table 2. See comment #14.

16. Page 11, last paragraph. The stopping criteria for [0,1] range should be clearly described instead of “whatever tie-breaking rules were used in the implementation.”

17. Once mentioned changes are made, some of the interesting quantitative variations for different raster representations should be mentioned/added to the abstract.

Reviewer #3: The manuscript studies the consequences of calculating shortest paths on raster data when two different aspects involved in the process are altered: network connectivity and range of the attribute scale to assign cost. The manuscript is very well written and easy to read. Comments follow.

- Pg. 7: please provide a short description of what the authors mean by “Qualitatively we compare..”. Even though, later it becomes clearer it is important to fully qualify what they are looking at.

- Pg. 7: the description of paths based on R=0, 1 and 2 seem obvious based on how the cells are connected to each other to form the network. Are there any other characteristics that the authors can provide besides this?

- Can the authors add a little more to the discussion section? For example, what happens with R=0 and R=1 networks? Yes, it is clear that the authors want to push for GIS software to include R=2 representations, but for completeness it would be important to see the differences on the paths based on the ranges on those networks too. Also, do the authors believe that a different algorithm for the shortest path calculation would yield different results?

- Please add a scale bar and north arrow to the maps.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Apr 15;16(4):e0250106. doi: 10.1371/journal.pone.0250106.r002

Author response to Decision Letter 0


22 Mar 2021

Dear PLOS ONE Referees and Editors,

Thank you for taking the time to review this article. Your comments were extensive, and they illuminated many items that needed clarification and correction in the first draft of this paper. In this revision I have addressed the reviewer comments, and in addition to the article submission I have provided a copy with annotated changes between the first draft and the current revision.

Editor Comments

1. I have created a DOI reference to a snapshot of the Github repository, which can be found here: https://doi.org/10.5281/zenodo.4540743, and included it in the article.

2. I have added 4 PLOS article references ([23], [39-41]), all of which provide strong support to this article. PLOS journals contain an excellent body of literature, thank you for suggesting such a strong reference resource.

3. GIS is now defined in the abstract and introduction. I have also added one application in the abstract to provide a concrete example that the reader can connect with.

4. I have updated the keywords to enable better discoverability. Thank you.

5. Good point, I have revised the connectivity definitions to be more well-defined, and have added a reference for additional clarification.

Journal Requirements

1. I have done my best to follow PLOS ONE style requirements.

2. All maps in this article were created programmatically by myself, the author. The raw data came in the form of numerical matrices, then rendering was performed with Java and the Processing API. All data and code used to generate the maps are available in the open-source Github repository. As the author and owner of all images in this publication, I grant PLOS ONE full rights to publish the images in this article.

Reviewer 1 Response

Thank you for this review, you bring up some excellent points that were very important to address. My comments are as follows:

1. I agree that the article made an abrupt transition from general GIS analysis to talking about corridor location. I have addressed this and improved the transition from the former to the latter in the introduction.

2. Indeed, the article needed a word about how representing continuous space on a digital computer requires discretization of both space and attributes, and that raster is used for corridor location since spatial information often originates from remote sensing imagery. I have added this to the background.

3. Thank you, the transition to MOSP was indeed awkward, I have revised the section introducing multi-objective analysis to do so more smoothly.

4. While the general content of Figure 1 is common to many publications, the exact figure was created by the author from scratch using PowerPoint.

5. I have revised the representative figure to add more solutions in objective space. This, along with the later figures in the article, should clarify the relationship between decision space and objective space, and how MOSP problems have many solutions to consider.

6. & 7. I have cleaned up references and other punctuation. Thank you for pointing this out.

Reviewer 2 Response

You bring up many good points, and you clearly did a careful and methodical review. Thank you for the time you have invested, my comments are as follows:

1. Indeed, cell size can have a major impact on spatial analysis. This has been well studied in the literature before; references [36] and [37] specifically looked at the ramifications of cell size on shortest path analysis. I focus this article on aspects that are not already prevalent in the literature.

2. This paper is quite long as-is, but this is an excellent topic for a follow-up study. Thank you for this great idea!

3. These are excellent references, thank you for pointing them out. I have added them to this article [23] & [24].

4. All maps in Fig 3 were created using Java for this study, not with GIS software, so a graphical legend is not a built-in capability. Fig 3a, though, uses the same colors as the NLCD2016 class legend, and the other maps in the figure use light colors to represent low slope or cost, and dark colors to represent high slope or cost, according to the classifications listed in Table 1. I have provided a web link to the NLCD2016 class legend in the article, and added a reference from Fig 3 to Table 1 as well.

5. Fixed.

6. I have added letters to both Fig. 7 and Fig 8, and have updated figure captions accordingly. I believe the legends should remain within the panels for better visual association.

7. This paper is quite long as-is, so I focused on analytical results rather than computation times. Also, I have previously published another study focusing only on computation times using the same data and algorithms, you are invited to read that at [49].

8. Indeed, the referenced articles do a thorough quantitative analysis of the path results. Here the focus is primarily on the qualitative differences in the analysis, but we will certainly use the same techniques from the referenced publications when we do a follow-up quantitative analysis.

9. Thank you, it’s now fixed.

10. Great catch, thank you!

11. That was a preliminary rough calculation, and without the associated thorough analysis should not be included. It has been removed. Thank you for pointing this out.

12. Done.

13. We tried explaining them in paragraph form, and it was hard and confusing to follow which was which. We believe it is much clearer to enumerate them in list format.

14. & 15. I have merged the tables, and am very happy with the result. Thank you for pointing this out.

16. This was poor wording on our part, this has been amended. Thank you.

17. We certainly will do so in our follow-up quantitative study.

Reviewer 3 Response

Thank you for your review comments. They are very well focused on important points, and my comments are as follows:

1. Thank you, this needed additional clarification. This has been amended.

2. Agreed, the analysis needed a conclusion to tie together the observations. We have added this.

3. Indeed, we did run the analysis for all combinations of classification ranges and radius values. I have addressed this in the discussion now. For the sake of brevity, I do not include figures for all 33 combinations in the article, but I have added a folder to the Github repository containing image captures of all analysis results, and of course a reader could replicate the results with the code and data posted on Github.

Regarding different results with different shortest path algorithms, any exact shortest path algorithm should give the same result if there is only one optimal path for the given data. In the rare case that there is more than one optimal path, the objective values for all optimal paths would be the same, but the decision space route might vary with different algorithms. The only instance where this would not be a rarity would be if the lower bound on the classification is zero, which is a pathological, unrealistic scenario in real-world applications.

4. The figures were created programmatically with Java and the Processing API, not GIS software, and thus scale bars and north arrows are not built-in functionality. But scale and orientation are very important items to note, so I have added clarification to the Data section as it applies to all maps in this article. Thank you for this observation.
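
The tie argument in response 3 above can be illustrated with a minimal Python sketch. It is not the code used in the study (which the author describes as Java); the 3x3 grids, the orthogonal-only (4-neighbor) connectivity, the arc cost taken as the sum of the two cell costs, and the function name `optimal_paths` are all assumptions made for illustration. It enumerates every simple path by brute force, which is only practical on a toy grid.

```python
# Hypothetical toy example: count tied optimal paths on a tiny raster.
# Assumptions (not from the article): 4-neighbor moves only, and the cost of
# an arc is the sum of the costs of the two cells it joins (integer, so
# equality comparisons between path costs are exact).

def optimal_paths(cost, src, dst):
    """Return (best_cost, list_of_paths) over all simple paths src -> dst."""
    rows, cols = len(cost), len(cost[0])
    best = None
    winners = []

    def dfs(cell, visited, path, total):
        nonlocal best, winners
        if cell == dst:
            if best is None or total < best:
                best, winners = total, [list(path)]   # new strict optimum
            elif total == best:
                winners.append(list(path))            # tie with the optimum
            return
        r, c = cell
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (r + dr, c + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and nxt not in visited:
                step = cost[r][c] + cost[nxt[0]][nxt[1]]
                visited.add(nxt)
                path.append(nxt)
                dfs(nxt, visited, path, total + step)
                path.pop()
                visited.remove(nxt)

    dfs(src, {src}, [src], 0)
    return best, winners


# Same terrain classified on two ranges: [1, 2] and the shifted range [0, 1].
ONE_TO_TWO = [[1, 1, 1],
              [1, 1, 1],
              [2, 2, 2]]
ZERO_TO_ONE = [[v - 1 for v in row] for row in ONE_TO_TWO]

SRC, DST = (1, 0), (1, 2)

for name, grid in (("[1, 2]", ONE_TO_TWO), ("[0, 1]", ZERO_TO_ONE)):
    best, paths = optimal_paths(grid, SRC, DST)
    print(name, "best cost:", best, "optimal paths:", len(paths))
```

Under these assumptions the [1, 2] grid has a single optimal path (cost 4), while the shifted [0, 1] grid has four zero-cost paths, so the route reported would depend on how a given implementation breaks ties.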

To all reviewers and editors:

Thank you all again for your reviews and comments. I firmly believe this article is much stronger now thanks to your contributions.

F. Antonio Medrano

Texas A&M University–Corpus Christi

Decision Letter 1

Timothy C Matisziw

31 Mar 2021

Effects of Raster Terrain Representation on GIS Shortest Path Analysis

PONE-D-20-40796R1

Dear Dr. Medrano,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Timothy C. Matisziw, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

All of the minor referee remarks appear to have been sufficiently addressed - Thank You!

Reviewers' comments:

Acceptance letter

Timothy C Matisziw

5 Apr 2021

PONE-D-20-40796R1

Effects of raster terrain representation on GIS shortest path analysis

Dear Dr. Medrano:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Timothy C. Matisziw

Academic Editor

PLOS ONE


Articles from PLoS ONE are provided here courtesy of PLOS
