Abstract
Accurately characterizing spatial patterns on landscapes is necessary to understand the processes that generate biodiversity, a problem that has applications in ecological theory, conservation planning, ecosystem restoration, and ecosystem management. However, the measurement of biodiversity patterns and the ecological and evolutionary processes that underlie those patterns is highly dependent on the study unit size, boundary placement, and number of observations. These issues, together known as the modifiable areal unit problem, are well known in geography. These factors limit the degree to which results from different metacommunity and macro-ecological studies can be compared to draw new inferences, and yet these types of comparisons are widespread in community ecology. Using aquatic community datasets, we demonstrate that spatial context drives analytical results when landscapes are sub-divided. Next, we present a framework for using resampling and neighborhood smoothing to standardize datasets to allow for inferential comparisons. We then provide examples for how addressing these issues enhances our ability to understand the processes shaping ecological communities at landscape scales and allows for informative meta-analytical synthesis. We conclude by calling for greater recognition of issues derived from the modifiable areal unit problem in community ecology, discuss implications of the problem for interpreting the existing literature, and identify tools and approaches for future research.
Introduction
We cannot understand why ecological communities are distributed as they are in space and time without first measuring how they are distributed in space and time. The sampling and measurement of ecological communities, particularly biological diversity, is a key methodological issue in community ecology (Magurran 2004, Rosenzweig 1995). While widely accepted standard methods exist for measuring diversity in a single location (Magurran 2004), hereafter referred to as local diversity, the appropriate methods for measuring regional diversity are controversial (Tuomisto 2010). In recent years a growing number of community and ecosystem ecologists have begun to examine patterns and infer processes occurring across multiple spatial scales of inference (Leibold, et al. 2004, Logue, et al. 2011). This new research has led to comparisons among studies (Anna, et al. 2012, Beisner, et al. 2006, Cottenie 2005, Mazaris, et al. 2010). However, these comparisons are predicated on the assumption that we can make informative comparisons between different landscape scale ecological studies. Is that a fair assumption? Can we accurately infer the causes for differences between the results of landscape scale studies?
Comparisons of studies conducted at landscape scales to infer ecological processes often assume that ecological processes, such as differences in environmental drivers, species traits, or other intrinsic properties of the contrasting systems, cause the observed differences. However, the effects of two additional phenomena complicate cross study comparisons: the “zoning problem” and the “scale problem”, which collectively form the Modifiable Areal Unit Problem (MAUP) (Openshaw 1984). The basic premise of MAUP is that the specific pattern and scale of defining zones of analysis in a given area affects the variation in calculated data values and thus affects the conclusions the investigator will draw (Jelinski and Wu 1996). The intensity of the MAUP effect is unpredictable and can be particularly problematic when dealing with multivariate data such as community or environmental matrices (Fotheringham and Wong 1991). While the issue of scale has received considerable attention in ecology (Levin 1992, Mykrä, et al. 2007, Storch, et al. 2007), the zoning problem has received less attention (Dark and Bram 2007), and the interaction between the two has been largely ignored. However, studies at intermediate spatial scales typically incorporate a priori zoning decisions that can affect the interpretation of ecological data at landscape scales. For example, in Figure 1A, we show a hypothetical example of community turnover along an elevation gradient. Analysis based on the position of the boundary in Fig 1B would obscure a strong pattern in the data, while analysis based on the boundaries in Figure 1C would capture the same pattern. Although this example is simple, it illustrates that an underlying gradient not known to the investigator a priori can be obscured by the zonation pattern.
Ecologists have used many zonation strategies in landscape scale analyses. A common strategy is to group sites that share common geopolitical attributes. For example, fish β-diversity declines have been compared among states of the coterminous United States (Rahel 2000, Rahel 2002), and the homogenization of plant communities has been studied using counties in California as groupings (Schwartz, et al. 2006). Dividing data by country, state, or county is often practical, particularly when aggregating multiple data sources that were collected by government agencies, but may not make ecological sense. In contrast, more recently investigators have defined spatial areas using common environmental characteristics or dispersal barriers. For example, watersheds and catchments have been used as spatial units for aquatic organisms based on the rationale that dispersal among lotic habitats within a watershed is greater than dispersal between watersheds (Astorga, et al. 2014, Heino, et al. 2015).
In most cases multivariate metrics, the metrics most sensitive to the MAUP effect, are calculated for each zone to summarize ecological communities and environmental conditions. For example, β-diversity, mean α-richness, and γ-richness are often calculated for zonation units. Environmental data are also typically summarized within the spatial units. Standard metacommunity analyses designed to partition variation between local and regional processes, such as principal coordinates of neighbor matrices (PCNM) and partial redundancy analysis of species data by environmental and PCNM variables, are also often applied to these units (Borcard, et al. 2004, Legendre, et al. 2005). Derived statistics are then summarized to make comparisons among zones within a single analysis, or among many different analyses (Cottenie 2005). However, these types of comparisons may all be influenced by MAUP.
The estimation of environmental heterogeneity and fragmentation, mean connectivity, and the relative strength of niche and neutral processes are all examples of aggregated calculations that could vary based on the scale and arrangement of landscape zonation. However, is it necessary to zonate a landscape at all? If there are no discrete boundaries, why impose them? Leibold et al. (2004) define a metacommunity as a group of sites connected by the dispersal of multiple potentially interacting species. These relationships are often represented with a network diagram (Figure 2A); however, it would be unusual for all of the sites to be exchanging individuals with one another (Figure 2B). For a group of sites with incomplete pairwise connectivity, we can follow the original definition based on exchange of individuals and define a dispersal network for each site (Figure 2C-D). When viewed together the site-specific networks overlap and no discrete boundaries exist (Figure 2E); instead, a continuously overlapping set of site-specific networks extends as far as the available habitat. Analytically this pattern can be observed by measuring regional metrics around focal locations using local neighborhoods defined by connectivity and a moving window (Luck and Wu 2002). This approach sidesteps the problem of determining the precise location of a metacommunity boundary by specifying an analysis that is flexible enough to identify unknown boundaries.
Use of a moving window, which we will hereafter refer to as smoothing, effectively deals with the zonation component of the MAUP (Openshaw 1984). Selection of grain size is inherent in the smoothing approach and thus the technique readily provides an approach for quantitatively examining the effects of scale without the confounding effect of zonation. To test the utility of smoothing to control for zonation effects and to allow for data integration we developed a neighborhood smoothing protocol for visualizing and analyzing community ecology patterns at macro-scales, which we call Macro-Ecological Spatial Smoothing (MESS). MESS is a flexible framework that is implemented in R, and can be applied to any large ecological dataset. Here, we present the MESS protocol and explore several applications using regional to continental scale aquatic taxa datasets.
We use stream fish data to display the MAUP effect on β-diversity patterns;
We use simulated data to quantitatively compare the performance of zonation approaches and MESS at capturing spatial patterns in processes driving community assembly;
We demonstrate the utility of MESS for exploring scale effects on biodiversity patterns and relationships;
We demonstrate the utility of MESS for standardizing the analysis of data from multiple sources (stream fish, stream invertebrates, lake invertebrates, zooplankton) to facilitate synthesis.
Methods
I. Macro-Ecological Spatial Smoothing
Macro-Ecological Spatial Smoothing (MESS) consists of sliding a moving window across a landscape and within each spatial window resampling and summarizing local observations (see Appendix S1 for an example R script of this protocol). To apply the MESS framework to a dataset the investigator selects a spatial grain (s) for the sampling regions (r), specifies the number (n) of random subsamples (rs) of local sites (ls) of a specified sample size (ss) to be drawn with replacement in each r for computing metrics, and specifies the minimum number (mn) of ls an r must contain to be included in the analysis. Investigator specified metrics (e.g. β-diversity, environmental heterogeneity, etc) calculated from each rs can be retained or averaged within each r. The purpose of using a uniform spatial grain (defined by s) and ls sample size for each rs (defined by ss) is to remove statistical artifacts when comparing metacommunities (Bennett and Gilbert 2016). The purpose for resampling (defined by rs) within each r is to minimize the influence of outlier observations and increase precision. The purpose for specifying a minimum number of local sites within each r is to ensure each r contains enough sites to support resampling.
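The protocol described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation (which is the R script in Appendix S1). The function name `mess`, the use of planar coordinates in km, the choice to center a window on every site, and the treatment of s as a window radius are all assumptions. The text's "drawn with replacement" is read here as: each subsample takes ss distinct sites, and sites return to the pool between subsamples.

```python
import math
import random
from statistics import mean

def mess(coords, metric, s, n, ss, mn, seed=0):
    """Moving-window resampling sketch (hypothetical helper).

    coords : list of (x, y) site positions in km (planar coordinates assumed)
    metric : function taking a list of site indices and returning a number
    s      : window radius in km (assumption: grain treated as a radius)
    n      : number of random subsamples (rs) per window
    ss     : number of local sites (ls) per subsample
    mn     : minimum number of ls a window needs to be analysed
    Returns {focal site index: mean metric across the n subsamples}.
    """
    rng = random.Random(seed)
    out = {}
    for i, (xi, yi) in enumerate(coords):
        # Region r: every site within distance s of the focal site.
        region = [j for j, (xj, yj) in enumerate(coords)
                  if math.hypot(xi - xj, yi - yj) <= s]
        if len(region) < max(mn, ss):
            continue  # too few sites to support resampling
        # Each subsample holds ss distinct sites; the pool is restored
        # between subsamples (assumed reading of "with replacement").
        values = [metric(rng.sample(region, ss)) for _ in range(n)]
        out[i] = mean(values)
    return out

# Toy usage: four clustered sites and one isolated site; the metric is
# simply the number of sites in the subsample.
coords = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)]
result = mess(coords, metric=len, s=5, n=3, ss=2, mn=3)
```

In this toy run the isolated fifth site is dropped because its window holds fewer than mn sites, which is the role the text assigns to the minimum-site threshold.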
For our case study analyses, we computed the following diversity metrics for each rs: α-richness, defined as the mean number of taxa per ls in each rs; γ-richness, defined as the total number of taxa observed in each rs; and β-diversity, defined as the mean pairwise Bray-Curtis dissimilarity among the ls in each rs. We retained the average value of the metrics among the set of rs collected from each r (see Appendix S1 for a detailed flow chart of an example analysis). Calculations of γ- and β-diversity were performed using the package vegan in the R Statistical Environment (2013).
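As a concrete illustration of these three definitions, the following Python sketch computes them for one subsample. It is not the authors' code (they used the vegan package in R); the function names and the toy abundance values are hypothetical, and Bray-Curtis is written in its standard abundance form, the sum of absolute differences divided by the sum of all abundances.

```python
from itertools import combinations
from statistics import mean

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both sites are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total

def diversity(sample):
    """sample: list of per-site abundance vectors (one subsample, rs)."""
    # Alpha: mean number of taxa present per local site.
    alpha = mean(sum(1 for x in site if x > 0) for site in sample)
    # Gamma: number of taxa present anywhere in the subsample.
    gamma = sum(1 for col in zip(*sample) if any(x > 0 for x in col))
    # Beta: mean pairwise Bray-Curtis dissimilarity among the sites.
    beta = mean(bray_curtis(a, b) for a, b in combinations(sample, 2))
    return alpha, gamma, beta

# Toy subsample: three sites (rows) by three taxa (columns).
sample = [[4, 0, 1], [0, 2, 1], [3, 3, 0]]
alpha, gamma, beta = diversity(sample)
```

Here every site holds two taxa while the subsample holds three, so α-richness is 2 and γ-richness is 3; β-diversity is the average of the three pairwise dissimilarities.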
II. Testing the effectiveness of MESS
We applied zonation approaches and the MESS approach to fish community data collected by the Maryland Biological Stream Survey (MBSS; Southerland, et al. 2005). For MBSS fish sampling, a three-person team makes two passes with a backpack electrofishing unit over a 75-meter reach with nets at both ends during the summer. Initial data screening identified 1035 stream samples collected by MBSS in wadeable 1st – 3rd order streams from 2007 to 2013 (73 – 180 per year). Of these, 613 samples with fish communities were retained for further analysis. We used three zonation approaches: grouping by county, physiographic province, and HUC12 watershed. To implement MESS, we used a simple Euclidean distance matrix and set the spatial grain (s) of each region (r) to 40km to capture ~90% of migratory fish movement. We chose to collect 10 (n) random samples (rs) of 20 (ss) local sites (ls) (see Appendix S2 for a detailed explanation of how we selected the parameters s, n, rs, and ss). In all cases α, β, and γ diversity were calculated from each rs.
To assess how well different approaches captured underlying diversity patterns we used three methods. First, we compared the statistical distributions of each diversity metric among approaches using ANOVA and the relationships between metrics (α~γ, β~γ) among methods using ANCOVA. Second, we created a map of community composition on the landscape (see method below) and compared the visual pattern of community uniqueness to the β-diversity patterns generated by the different methods. For the first two approaches we used the actual data collected in the surveys. For the third approach, we simulated the assembly of new communities on the landscape using a deterministic community assembly model (see method below) and compared how well each approach (MESS and 3 zonations) reproduced the relative importance of environmental drivers of community assembly. The visualization and simulation strategies are explained in detail below.
Visualization
We visualized community composition across the landscape by generating a representative color for each community based on its species composition and then mapping the colors onto the sampling locations (Baldeck, et al. 2013, Thessler, et al. 2005). To relate community composition to color, we first ordinated fish assemblage composition using a non-metric multidimensional scaling (NMDS) constrained to three axes (Oksanen, et al. 2014). We then assigned red, green, and blue colors to each of the three axes. Position along each NMDS axis was represented as the intensity of the assigned color, and the combination of the three colors corresponded to the site’s position in three-dimensional ordination space. We then colored each location on a landscape map according to the ordination position of the local community. Because differences in color directly correspond with differences in composition, regions of high and low β-diversity corresponded with regions with high and low color variation. We then compared the community composition color map with the different maps of β-diversity to qualitatively assess the effectiveness of the different zonation schemes in reproducing the visual color patterns.
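The color-assignment step can be illustrated with a short Python sketch. It assumes the three-axis ordination scores have already been computed elsewhere (the NMDS itself is not reproduced here) and that each axis is rescaled linearly between its minimum and maximum to a 0-255 intensity. The rescaling rule, the function name, and the example scores are assumptions, since the text does not specify how position was converted to intensity.

```python
def scores_to_rgb(scores):
    """Rescale three ordination axes to 0-255 and pair them as (R, G, B).

    scores : list of (axis1, axis2, axis3) tuples, one per site.
    Min-max rescaling per axis is an assumed choice.
    """
    axes = list(zip(*scores))
    if len(axes) != 3:
        raise ValueError("expected three ordination axes")
    scaled = []
    for axis in axes:
        lo, hi = min(axis), max(axis)
        span = (hi - lo) or 1.0  # avoid division by zero on a flat axis
        scaled.append([round(255 * (v - lo) / span) for v in axis])
    # Regroup per site: one (red, green, blue) triple for each community.
    return list(zip(*scaled))

# Hypothetical ordination scores for three sites.
scores = [(-1.0, 0.0, 2.0), (0.0, 0.5, 2.0), (1.0, 1.0, 4.0)]
colors = scores_to_rgb(scores)
```

Sites that sit close together in ordination space receive similar triples, so neighborhoods with varied colors on the map are neighborhoods with high compositional turnover.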
Simulations
We constructed a scenario where the fish communities were assembled according to two environmental gradients, whose relative importance to community assembly changed spatially on a latitudinal gradient (see Appendix S3 for complete R script). After assembling communities on each landscape, we used variance partitioning implemented through the different grouping schemes (MESS, County, Physiographic Province, Watershed, Gridded) to estimate the coefficients of the relative importance of the two environmental gradients. The gridded scheme consisted of splitting the landscape into equal sized quadrants with the same grain (40km) as the MESS analysis to evaluate the value of the smoothing procedure in MESS independent of the uniform grain size procedure in MESS. The differences between the estimated coefficients and actual coefficients used to simulate the communities were then compared among grouping schemes for each set of simulations to assess goodness of fit.
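To make the contrast between the gridded scheme and a moving window concrete, the following Python sketch assigns sites to fixed grid cells. It is a minimal illustration under stated assumptions: planar coordinates in km, square cells anchored at the origin, and a hypothetical function name. The authors' actual gridding procedure is in Appendix S3.

```python
import math
from collections import defaultdict

def grid_groups(coords, cell=40.0):
    """Group site indices by the square grid cell that contains them."""
    groups = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        # Integer cell index along each axis (assumed origin at 0, 0).
        groups[(math.floor(x / cell), math.floor(y / cell))].append(i)
    return dict(groups)

# Hypothetical site positions in km.
coords = [(5.0, 5.0), (39.0, 10.0), (41.0, 10.0), (85.0, 50.0)]
groups = grid_groups(coords)
```

In this toy layout the second and third sites lie only 2 km apart yet fall in different cells, which is the zoning effect that a fixed grid retains even when its grain matches that of MESS.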
For comparability with the visual mapping exercise above, we used the same landscape in the simulation models. We used the real sampling locations (sk) and then assigned each location a value on two environmental gradients (gik) ranging from 0 to 5000. For each simulation we generated normally distributed niches of 50 taxa using the Beta function with random parameterizations of niche width (random value from uniform distribution 10–5000), amplitude (random value from log normal distribution with µ = 2, SD = 1), and optimum conditions (random value from uniform distribution 0–5000) (Fridley, et al. 2007, Minchin 1987). Normally distributed niches were created by holding niche shape parameters for the left side (a) and right side (b) at 1.99 following Fridley et al. (2007). Environmental gradient g1 values were assigned randomly to sampling locations while gradient g2 changed with longitudinal spatial location from 0 to 5000 across the region. The relative importance (m) of the two environmental gradients changed with the latitude of the site (Lsk) with g1 being more important in the south and g2 being more important in the north. Latitudinal change in m was determined by a sigmoid function (Eq 1) in which m, by definition, was equal to 0.5 in the center of the region (Lsk = 39.04˚N), and at this location, both g1 and g2 contribute equally to the species composition.
The steepness of the change in relative importance of the two environmental gradients around this central axis was modified by a shape coefficient, z. We ran 10 sets of simulations (n=100 per set) with different values of the steepness parameter z ranging from 0.5 (nearly linear change) to 9 (very rapid change). The number of individuals of each taxon in each site in each simulation was determined by first randomly selecting n individuals in each site according to g1 and g2, where probability of selection was weighted by taxon abundance in the location according to the simulated beta distributions (Fridley, et al. 2007). These two taxon abundance vectors (one for each gradient) were then combined as a weighted average determined by the relative importance (msk) of g1 and g2 in each site (sk).
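The building blocks of this simulation can be sketched in Python. Everything below is an illustration under assumptions rather than the authors' model (Appendix S3). The niche curve uses a generalized beta response of the kind described by Minchin (1987), parameterized here by optimum and width. The sigmoid is written as a logistic curve, chosen only because it equals 0.5 at the central latitude and steepens with z, and m is assumed to be the weight on g2. The log-normal parameters are assumed to be on the log scale. The random individual-sampling step is omitted.

```python
import math
import random

def beta_niche(x, optimum, width, amplitude, a=1.99, b=1.99):
    """Abundance at gradient position x under a beta-shaped niche."""
    mode = a / (a + b)               # relative position of the peak
    lower = optimum - mode * width   # lower limit of the niche
    u = (x - lower) / width          # position rescaled to 0..1
    if u <= 0.0 or u >= 1.0:
        return 0.0                   # outside the niche
    peak = mode ** a * (1.0 - mode) ** b  # scales the maximum to amplitude
    return amplitude * (u ** a) * ((1.0 - u) ** b) / peak

def relative_importance(lat, z, center=39.04):
    """Assumed logistic form: weight on g2, rising toward the north."""
    return 1.0 / (1.0 + math.exp(-z * (lat - center)))

def combine(abund_g1, abund_g2, m):
    """Weighted average of the two per-gradient abundance vectors."""
    return [(1.0 - m) * a1 + m * a2 for a1, a2 in zip(abund_g1, abund_g2)]

# Fifty random taxa drawn with the parameter ranges given in the text.
rng = random.Random(1)
taxa = [dict(optimum=rng.uniform(0, 5000),
             width=rng.uniform(10, 5000),
             amplitude=rng.lognormvariate(2, 1)) for _ in range(50)]
abundances = [beta_niche(2500.0, **t) for t in taxa]

m_center = relative_importance(39.04, z=3.0)
m_north = relative_importance(39.7, z=3.0)
mixed = combine([10, 0, 4], [0, 6, 4], m_center)
```

At the central latitude the two gradients are weighted equally, so the combined vector is the plain average of the two inputs; larger z pushes sites north or south of the center toward pure g2 or pure g1 composition more quickly.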
To estimate the relative importance coefficient (m) for each site in the simulated data, we used partial RDA. Partial RDA is commonly used in metacommunity analyses to determine the relative importance of spatial and environmental factors driving community assembly (Legendre, et al. 2005). Here we used it to estimate the relative importance (emsk) of g1 and g2 in each region or zone. The partial RDA analysis was performed for each simulated landscape within each group defined by watershed, county, physiographic province, a regular 40km grid, and within regions inside the MESS workflow. The variation explained by the relationship between the estimate (emsk) and the known value (msk) was calculated for each grouping strategy for each simulation, and values were then compared among groups across simulations using ANOVA. This purely environmental analysis is analytically comparable to the type of application often employed in metacommunity research, but easier to interpret and evaluate than a simulation model with dispersal, where the dispersal parameter would not linearly correspond with the partial RDA output. A higher variance explained by one grouping strategy over another indicates that the grouping scheme is better at estimating multivariate spatial patterns from observed data.
III. Applications
A. Exploring Scale Effects
Application of MESS should allow investigators to examine scale-dependent relationships that are independent of zonation effects. To estimate the impact of spatial grain independent of zonation, we repeated the MESS application explained above for nine different spatial grains (20km - 100km in 10km intervals). The relationship between spatial grain and diversity measures was evaluated within and among locations. We also evaluated the effect of sampling method and sampling window on two commonly investigated diversity relationships: the relationship between α-richness and γ-richness, and the relationship between β-diversity and γ-richness. The relationship between diversity metrics was assessed with a partial Mantel statistic calculated from a partial Pearson correlation between the diversity matrices and conditioned on a geographic distance matrix to correct for spatial autocorrelation across the landscape (Legendre and Legendre 1998). Calculations of γ- and β-diversity, and the multivariate dispersion tests for β-diversity were performed using the package vegan in the R Statistical Environment (2013).
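The partial Mantel statistic named above can be sketched as follows. This Python version is a simplified illustration, not the vegan routine the authors used. It assumes the "diversity matrices" are pairwise absolute differences in a diversity value between focal locations, uses a one-sided permutation test that shuffles the rows and columns of the first matrix, and introduces hypothetical function names and toy data.

```python
import math
import random

def _upper(mat):
    """Flatten the upper triangle of a square matrix."""
    n = len(mat)
    return [mat[i][j] for i in range(n) for j in range(i + 1, n)]

def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def _partial(rab, rac, rbc):
    """Partial correlation of A and B with C held constant."""
    return (rab - rac * rbc) / math.sqrt((1 - rac ** 2) * (1 - rbc ** 2))

def partial_mantel(A, B, C, permutations=999, seed=0):
    """Return (statistic, one-sided permutation P-value) for A ~ B | C."""
    rng = random.Random(seed)
    b, c = _upper(B), _upper(C)
    rbc = _pearson(b, c)

    def stat(mat):
        a = _upper(mat)
        return _partial(_pearson(a, b), _pearson(a, c), rbc)

    observed = stat(A)
    n = len(A)
    hits = 0
    for _ in range(permutations):
        p = list(range(n))
        rng.shuffle(p)
        # Permute rows and columns of A together to keep it symmetric.
        permuted = [[A[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if stat(permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

def distance_matrix(values):
    """Pairwise absolute differences (assumed construction)."""
    return [[abs(a - b) for b in values] for a in values]

# Hypothetical diversity values and positions (km) for six focal locations.
alpha = [8, 9, 11, 7, 10, 12]
gamma = [20, 25, 31, 18, 27, 35]
geo = distance_matrix([0, 40, 90, 130, 170, 260])
r, p = partial_mantel(distance_matrix(alpha), distance_matrix(gamma), geo)
```

Conditioning on the geographic matrix removes the part of the α-γ association that could arise simply because nearby windows share sites and therefore have similar values.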
B. Integrating datasets for common analysis
MAUP makes it difficult to compare the results of different analyses of spatial patterns conducted by different researchers. If observed relationships depend on scale and zonation, how can we synthesize these relationships to understand the processes that drive spatial patterns in biodiversity? MESS provides a simple solution to this issue. By analyzing different data sets with the MESS protocol, we can standardize diversity metric estimation. We demonstrate this application using two datasets from the EPA National Aquatic Resource Survey (NARS), the 2007 National Lakes Assessment and the 2009 National Rivers and Streams Assessment. The former dataset includes samples collected from lakes for zooplankton (n=1037) and for benthic, littoral invertebrates (n=924) distributed across the United States, whereas the latter dataset includes samples collected from wadeable lotic ecosystems for benthic invertebrates (n=1235) and fish (n=967) distributed across the United States. We applied the MESS framework to each of these four taxonomic groups using an s of 500km, an ss of 50, an n of 15, and an mn of 75. As above, we calculated the α-richness, γ-richness, and β-diversity for each rs. We then demonstrate how diversity patterns among disparate taxonomic groups in different ecosystems can be compared with one another.
Results
I. Visual Comparison
The NMDS model had a non-metric fit of 0.981 and a stress of 0.137, indicating a reasonable fit to the data. Visual inspection of the community uniqueness map indicated clusters of similar stream communities along a northeast-to-southwest axis across the upper portion of the region and a distinctly different community cluster along the eastern edge of the region (Fig 3). The remaining regions were more mixed, with particularly high color variation among neighboring streams on the western side of the region (Fig 3). The MESS output and the county grouping scheme both appeared to visually match the color patterns reasonably well, with areas of high β-diversity (dark grey) corresponding to areas of high color variation (Figure 3: B & C). The physiographic province groupings were too large and obscured visual patterns of interest, whereas the watershed grouping identified a number of β-diversity hotspots that were not visually identifiable on the color map. The main deviation between the county map and the MESS output was an area where the MESS output indicated that β-diversity was high. However, this area was split by a county border (inset box on Figure 3), which resulted in low β-diversity on either side. There were statistically significant differences among the β-diversity (P < 0.001, df = 3, F = 235.4) and γ-richness (P < 0.001, df = 3, F = 1244) distributions of the different groupings but not α-richness (P = 0.926, df = 3, F = 0.155; Table 1). Significant interactions were observed between method and the relationship between β-diversity and γ-richness (P < 0.001, df = 1,3,3, F = 17.03) and the relationship between α-richness and γ-richness (P < 0.001, df = 1,3,3, F = 64.43).
Table 1.
Grouping | α-richness | β-diversity | γ-richness | Slope: β ~ γ | Slope: α ~ γ |
---|---|---|---|---|---|
Physiographic Region | 8.044 ± 2.257 | 0.801 ± 0.053 | 34.000 ± 4.243 | −0.0036 | 0.5442 |
County | 8.790 ± 2.713 | 0.732 ± 0.063 | 25.048 ± 7.399 | 0.0037 | 0.2303 |
HUC12 Watershed | 8.341 ± 3.680 | 0.710 ± 0.138 | 16.424 ± 6.992 | 0.0007 | 0.4225 |
MESS 40km | 8.953 ± 2.134 | 0.778 ± 0.055 | 25.064 ± 4.594 | −0.0010 | 0.3673 |
II. Simulated Data Evaluation
In each model scenario (gradual to abrupt change), the aggregate results from each of the grouping schemes (county, watershed, physiographic province, gridded, and MESS) showed a change in the relative importance of the two environmental gradients moving from north to south (Figure 4). For the ANOVA model of the output, there were significant main effects of grouping scheme (df=4, F = 3427.48, P < 0.001) and the shape parameter z for the turnover gradient (df=9, F=12123.19, P < 0.001) on variance explained by the observed vs expected relationship, indicating that goodness of fit differed significantly among grouping schemes and turnover scenarios. There was also a significant interaction between grouping scheme and z (df=36, F=552.16, P < 0.001). In pairwise comparisons for the main effect of grouping scheme (Table 2), the MESS output explained significantly more variance in the relative importance of the two environmental gradients (0.750 ± 0.278 SD) on average than any other grouping scheme (P < 0.001 in all cases). Within levels of z, MESS was not the best model at three levels (z = 0.5, 1, and 9). For z = 0.5 and 1 (slow change in the relative importance of the two environmental gradients), physiographic province more accurately accounted for the changes in m. At z = 9.0 (fast change in the relative importance), MESS and the gridded sampling scheme performed similarly.
Table 2.
Grouping | z=0.5 | z = 1.0 | z = 2.0 | z = 3.0 | z = 4.0 | z = 5.0 | z = 6.0 | z = 7.0 | z = 8.0 | z = 9.0 |
---|---|---|---|---|---|---|---|---|---|---|
Mean R2 | ||||||||||
MESS 40km | 0.036 | 0.433 | 0.862 | 0.926 | 0.931 | 0.926 | 0.881 | 0.871 | 0.840 | 0.798 |
HUC12 Watershed | 0.024 | 0.056 | 0.498 | 0.715 | 0.701 | 0.720 | 0.685 | 0.650 | 0.660 | 0.584 |
Physiographic Region | 0.514 | 0.651 | 0.696 | 0.708 | 0.710 | 0.707 | 0.701 | 0.694 | 0.684 | 0.676 |
County | 0.027 | 0.229 | 0.712 | 0.780 | 0.770 | 0.741 | 0.709 | 0.695 | 0.648 | 0.630 |
40km Grid | 0.027 | 0.159 | 0.598 | 0.714 | 0.733 | 0.772 | 0.750 | 0.765 | 0.759 | 0.751 |
Standard Deviation R2 | ||||||||||
MESS 40km | 0.044 | 0.075 | 0.032 | 0.011 | 0.011 | 0.014 | 0.026 | 0.029 | 0.020 | 0.027 |
HUC12 Watershed | 0.036 | 0.035 | 0.072 | 0.064 | 0.077 | 0.061 | 0.075 | 0.058 | 0.072 | 0.060 |
Physiographic Region | 0.107 | 0.030 | 0.005 | 0.002 | 0.002 | 0.001 | 0.002 | 0.001 | 0.002 | 0.002 |
County | 0.017 | 0.108 | 0.059 | 0.039 | 0.043 | 0.037 | 0.041 | 0.032 | 0.051 | 0.039 |
40km Grid | 0.031 | 0.066 | 0.043 | 0.035 | 0.032 | 0.022 | 0.046 | 0.025 | 0.025 | 0.027 |
III. MESS Application
A. Exploring Scale Effects
The effect of s (grain size) on diversity varied among sites and among diversity measures (Figure 5A-C). γ-richness and β-diversity were affected by spatial grain, whereas α-richness was not. The slope of the relationship between s and α-richness ranged from negative to positive for different sites (0.0005 ± 0.02 SD, Figure 5D), whereas β-diversity (average: 0.002 ± 0.0009 SD) and γ-richness generally increased with spatial grain for all sites (average: 0.41 ± 0.23 SD, Figure 5 E & F). We observed a significant positive correlation between α-richness and γ-richness at all values of s except 20 and 70, and the strength of correlation increased as s increased (Table 3). No significant correlation was observed between β and γ when s was less than 60km; however, a significant positive relationship was observed at 60km, 70km, and 100km (Table 3).
Table 3.
Spatial Scale (km) | β ~ γ: Mantel Correlation | β ~ γ: P-value | α ~ γ: Mantel Correlation | α ~ γ: P-value |
---|---|---|---|---|
20 | −0.030 | 0.870 | −0.027 | 0.800 |
30 | −0.039 | 1.000 | 0.119 | 0.010 |
40 | −0.040 | 0.990 | 0.139 | 0.010 |
50 | 0.014 | 0.150 | 0.065 | 0.010 |
60 | 0.037 | 0.050 | 0.097 | 0.010 |
70 | 0.044 | 0.030 | 0.050 | 0.090 |
80 | 0.014 | 0.220 | 0.074 | 0.030 |
90 | 0.013 | 0.280 | 0.306 | 0.010 |
100 | 0.126 | 0.010 | 0.478 | 0.010 |
B. Integrating datasets for common analysis
We observed that α-, γ-, and β-diversity were similar for benthic invertebrates in both streams and lakes. Benthic invertebrate diversity values were significantly higher in all diversity categories than those observed for stream fish, whereas lake zooplankton had relatively low diversity (Table 4). Different relationships were observed between γ- and β-diversity for each taxonomic group, with negative correlations for benthic invertebrates and a positive relationship for stream fish (Table 4, Figure 6). When all the taxonomic groups are plotted together, a larger pattern emerges of increasing, saturating, and then declining β-diversity as γ-richness increases that is not apparent using other grouping approaches (Appendix S4). This cross-taxonomic pattern is also evident in the γ-richness ~ α-richness pairwise diversity relationship (Figure 6).
Table 4.
System | Taxa | α-richness | β-diversity | γ-richness | Slope: α ~ γ | Slope: β ~ γ |
---|---|---|---|---|---|---|
Streams | Fish | 15.090 ± 3.351 | 0.713 ± 0.068 | 117.732 ± 43.624 | 0.0660 | 0.0012 |
Streams | Benthic Invertebrates | 37.681 ± 6.866 | 0.872 ± 0.019 | 330.098 ± 31.597 | 0.1978 | −0.0004 |
Lakes | Benthic Invertebrates | 41.345 ± 8.798 | 0.853 ± 0.029 | 249.740 ± 24.713 | 0.3300 | −0.0009 |
Lakes | Zooplankton | 12.909 ± 1.376 | 0.803 ± 0.024 | 62.981 ± 4.260 | 0.1689 | −0.0014 |
Discussion
We highlight the problems that the Modifiable Areal Unit Problem (MAUP) poses for metacommunity and macro-ecological analyses and demonstrate how a version of spatial smoothing we developed for community data, Macro-Ecological Spatial Smoothing (MESS), provides a methodological solution. When measuring diversity at intermediate spatial scales, our measurements are often affected by boundary effects, variations in sample size, and the spatial scales of our measurements. These sources of variability have a strong potential to affect the interpretation of diversity analyses within datasets, and reduce our ability to compare results among different data sets. In concert, these sources of variation create unpredictable effects on the resulting diversity metrics. For example, we show that different zonation schemes of the same data produce different spatial representations of regional diversity. In contrast, the smoothing approach we present here, MESS, extracted diversity patterns that visually matched the actual observed patterns of community uniqueness. Furthermore, MESS generally outperformed the other zonation approaches at describing spatial variation in processes driving simulated community assembly. We also demonstrate that MESS is a powerful tool for exploring scale effects and for integrating datasets for common analysis.
Our visual comparison between the MESS output and the other methods is qualitative; however, the comparison offers a compelling visual argument for the problem that zonation presents in the analysis of ecological data and complements the analytical comparisons we provide using the simulation model. While it could be argued that each spatial representation is “right” for a particular question, our interpretation of the visual patterns of β-diversity on the maps (Figure 3) and how they compare to the colored community map is that the MESS output was the most useful for representing general diversity patterns across the landscape. Regions of the community uniqueness map that were particularly colorful were also regions where the MESS map exhibited particularly high β-diversity (Figure 3). The physiographic province map showed similar broad spatial patterns, indicating that physiographic province is one of the more important landscape attributes driving β-diversity in this region, but the level of detail in the patterns of β-diversity was coarser. On the other end of the range of spatial resolution, the watershed map did not accurately represent all of the landscape scale β-diversity patterns. The county map, which most closely matched the MESS output for much of the landscape, neatly demonstrates the problem with boundaries. In western Maryland, the region highlighted in Figure 3 exhibited high turnover on the MESS map and low turnover on the county map. A county boundary bisects the location, creating the discrepancy between the maps and illustrating the issue we described in our conceptual description (Figure 1).
The simulation model provided quantitative support for the value of MESS. In 8 of the 10 sets of simulations, MESS provided better estimates of the relative importance (m) of the two environmental gradients than all other grouping strategies. The two scenarios in which MESS was somewhat less effective in predicting m were (1) the case of a slow change in the relative importance of the two environmental gradients and (2) the case of a fast change in the relative importance (Appendix S3: Fig 1). In the latter case, MESS and the gridded sampling approach perform comparably. In the former case, all of the sampling groups including MESS performed poorly with the exception of physiographic province. The odd shape of the study region, coupled with a slow change in relative importance likely explains the superiority of physiographic province in this one case. Physiographic provinces of the east coast of the United States are oriented in an east-west gradient and the western side of the state (entirely Highlands Province) extends into a narrow strip restricted to higher latitudes (Fig 3). Thus, the Highlands Province coincides with the upper end of the pre-determined latitudinal gradient and goodness-of-fit for the physiographic province group in that region is excellent, increasing the overall performance of that zonation pattern. This advantage only was apparent when the change in the relative importance of the two environmental gradients was slow (z = 0.5 and 1). Overall, the MESS approach is the most flexible tool and performs well over a range of conditions.
An alternative moving-window approach to computing β-diversity models the rate of distance decay in focal regions around individual points (McKnight et al. 2007). Our approach models β-diversity as a constant value within each moving window, which provides a simpler description than the distance-decay model. However, by examining the effects of different window sizes, we can extract many of the same insights as those derived from the distance-decay model. Furthermore, our simple approach can be readily applied to α- and γ-diversity, enabling comparisons among these three parameters using a common analytical approach.
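To make the constant-within-window idea concrete, the following is a minimal sketch rather than the implementation used in this study. It assumes presence/absence data, a circular window defined by Euclidean distance, repeated resampling of a fixed number of sites per window to standardize sampling effort, and multiplicative β (γ divided by mean α). The function name `window_diversity` and its parameters are hypothetical.

```python
import numpy as np

def window_diversity(coords, presence, radius, n_sites=5, n_draws=100, seed=0):
    """Moving-window alpha, beta and gamma for every focal site.

    coords   : (S, 2) array of site coordinates (same units as radius)
    presence : (S, T) 0/1 array, sites x taxa
    Returns an (S, 3) array of [mean alpha, beta, gamma]; a row is NaN
    when fewer than n_sites sites fall inside the window.
    """
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    presence = np.asarray(presence).astype(bool)
    # Pairwise Euclidean distances between all sites.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    out = np.full((len(coords), 3), np.nan)
    for i in range(len(coords)):
        neighbours = np.flatnonzero(dist[i] <= radius)  # includes the focal site
        if neighbours.size < n_sites:
            continue  # too few sites to standardize sample size
        draws = np.empty((n_draws, 2))
        for d in range(n_draws):
            # Assumption: equal-effort resampling without replacement.
            pick = rng.choice(neighbours, size=n_sites, replace=False)
            sub = presence[pick]
            draws[d, 0] = sub.sum(axis=1).mean()  # mean local richness
            draws[d, 1] = sub.any(axis=0).sum()   # pooled richness
        alpha, gamma = draws.mean(axis=0)
        # Assumption: multiplicative beta; other metrics could be substituted.
        beta = gamma / alpha if alpha > 0 else np.nan
        out[i] = (alpha, beta, gamma)
    return out

# Toy data: three clustered sites and one isolated site.
coords = [[0, 0], [1, 0], [2, 0], [10, 0]]
presence = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
result = window_diversity(coords, presence, radius=2.5, n_sites=3, n_draws=10)
```

Because all three quantities come from the same resampled windows, repeating the call over several values of `radius` gives a simple way to examine the window-size effects described above.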
One potential issue with the MESS approach is that the regional diversity estimates for closely neighboring focal locations spatially overlap, raising the potential problem of statistical non-independence. We primarily envision MESS as a tool for visualizing changes in diversity patterns across the landscape, and hence, formal statistical testing would not be required in the majority of its applications. In cases in which statistical tests are desired, existing methods for treating these types of results (e.g., permutation tests or sample size adjustments) should be applicable (Dormann, et al. 2007).
A second potential issue with the approach is dealing with ecologically valid boundaries such as dispersal barriers. Watersheds are an excellent example, where headwaters may be close to other headwaters as the crow flies but very distant in terms of river miles. In our initial case studies, we used Euclidean distance to define sampling regions, which has been shown in other studies to be an effective predictor of metacommunity dynamics and distance decay patterns in streams and rivers (Olli‐Matti, et al. 2015, Patrick and Swan 2011). However, alternative distance matrices such as river network distance or cost-weighted distance as experienced by the study organisms (e.g., ecological distance) could be substituted for the Euclidean distance matrix we used in the analysis (Sutherland, et al. 2015). Here we show that MESS informed by Euclidean distance outperforms alternative grouping methods, and we expect that using more refined distance matrices in the MESS framework would further enhance these results.
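The substitution of distance matrices can be illustrated with a small sketch. It is not taken from the analysis in this paper: the three-site network, its reach lengths, and the helper names `network_distance` and `window_members` are hypothetical. The sketch shows that window membership depends only on the distance matrix supplied, so two headwaters that are close as the crow flies can fall in the same Euclidean window but in different network windows.

```python
import numpy as np

def network_distance(n_sites, edges):
    """All-pairs shortest-path distance along a network (Floyd-Warshall).

    edges : iterable of (i, j, length) for undirected reaches between sites.
    """
    d = np.full((n_sites, n_sites), np.inf)
    np.fill_diagonal(d, 0.0)
    for i, j, length in edges:
        d[i, j] = d[j, i] = min(d[i, j], length)
    for k in range(n_sites):
        # Allow paths that pass through site k.
        d = np.minimum(d, d[:, [k]] + d[[k], :])
    return d

def window_members(dist, radius):
    """Indices of the sites inside each focal site's window."""
    return [np.flatnonzero(row <= radius) for row in dist]

# Hypothetical layout: sites 0 and 1 are neighbouring headwaters,
# site 2 is the downstream confluence that connects them.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, -5.0]])
euclid = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
river = network_distance(3, [(0, 2, 5.0), (1, 2, 5.0)])

euclid_windows = window_members(euclid, radius=2.0)
river_windows = window_members(river, radius=2.0)
```

Any symmetric matrix of pairwise distances, including a cost-weighted one, could be passed to `window_members` in the same way.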
Our observations provide a context for interpreting studies that compare regional diversity without correcting for the effects of scale. Studies that do not correct for scale when comparing regions implicitly assume that differences in region size do not substantially affect the comparison between regions. For example, catchments, watersheds, and political boundaries differ greatly in spatial extent, and yet they are commonly used as units of observation (Astorga, et al. 2014, Brown 2011, Heino, et al. 2015). Others have added covariates to account for the effect of differing spatial extent (Heino, et al. 2015). Our results show that regional-scale biodiversity metrics calculated for different spatial extents should be compared with caution, and our observation that the functional form of spatial effects depends on location suggests that accounting for these effects with covariates may be difficult.
The effect of spatial scale takes on additional importance when considering the form of basic theoretical relationships between α-richness, β-diversity, and γ-richness. These relationships have been explored within and among a variety of different ecosystems (Cottenie 2005, Crist, et al. 2003, Gering and Crist 2002, Kraft, et al. 2011, Martiny, et al. 2011, Soininen, et al. 2007). While observed differences have been attributed to factors such as organism demography, dispersal ability, or landscape fragmentation, we demonstrated that simple differences in spatial scale among datasets can also generate differences in regional diversity properties. In the stream fish dataset, we showed that as the spatial grain of the analysis increased, the relationship between γ-richness and β-diversity changed from a positive to a negative association. In the cross-taxa analysis across the United States at a fixed grain size, we observed that a common, underlying curvilinear relationship existed among fish and invertebrates that changed from positive to negative as γ-richness increased. Studies identifying a positive or negative relationship between γ and β may do so only because they consider a part of the larger range of possibilities. The relationships we present here are likely driven by general ecological processes. For example, zooplankton, which are governed by substantially different dispersal dynamics than benthic invertebrates and fish (Louette and De Meester 2005), do not follow the β~γ pattern of fish and benthic invertebrates.
While it has long been known that biodiversity estimates are related to spatial grain (Arrhenius 1921, Cain 1938, Tuomisto 2010), our scale analysis highlights that the effect of spatial grain on diversity differs between diversity metrics and depends on location. We observed rates ranging from one new taxon added every kilometer added to the radius to one new taxon added every 20 kilometers added to the radius. Presumably, the higher rates occurred in areas that had higher habitat heterogeneity and species turnover. These types of phenomena are not novel observations (Hanski and Gyllenberg 1997, Storch, et al. 2003), and they underlie the historical problem with defining the exact form of the species-area relationship (Storch, et al. 2007). However, these issues may present serious problems for interpreting β-diversity (Steinbauer, et al. 2012), particularly for β-diversity metrics that are mathematically dependent on γ-richness (Jost 2007, Tuomisto 2010). Taken together, the analyses underscore a concept well known to community ecologists: diversity depends strongly on how things are sampled (Rosenzweig 1995) at the regional scale and at the local scale (Barton, et al. 2013, Steinbauer, et al. 2012). Sample size has a strong impact on the precision of estimates, and insufficient sampling will lead to noisy relationships. Spatial grain affects diversity measures, and those effects differ among diversity measures and vary from location to location. Boundaries can create sharp changes in diversity where none exist in the underlying communities. Failing to control for variation in sampling intensity, spatial grain, and boundaries adds variation to estimates that cannot be accurately modeled with covariates. When these issues are not accounted for, comparisons among different datasets may not be informative, even when each individual dataset was well sampled.
The same is true of relationships among regions within a single dataset and all multivariate measures calculated at regional scales.
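The location-specific rate at which taxa accumulate as the window radius grows can be sketched as follows. This is an illustrative simplification, not the procedure used in the study: it pools every site inside the window without resampling, summarizes the rate with an ordinary least-squares slope, and uses made-up data. The function name `gamma_accumulation_rate` is hypothetical.

```python
import numpy as np

def gamma_accumulation_rate(coords, presence, focal, radii):
    """Pooled richness at each radius around one focal site, plus the
    least-squares slope (taxa gained per unit increase in radius)."""
    coords = np.asarray(coords, dtype=float)
    presence = np.asarray(presence).astype(bool)
    radii = np.asarray(radii, dtype=float)
    dist = np.linalg.norm(coords - coords[focal], axis=1)
    # Number of taxa present in at least one site within each radius.
    gamma = np.array([presence[dist <= r].any(axis=0).sum() for r in radii])
    # Assumption: a linear fit is an adequate summary over this range.
    slope, _intercept = np.polyfit(radii, gamma, deg=1)
    return gamma, slope

# Toy transects in which every site contributes exactly one new taxon.
presence = np.eye(4, dtype=int)
dense = [[0, 0], [1, 0], [2, 0], [3, 0]]      # sites 1 km apart
sparse = [[0, 0], [20, 0], [40, 0], [60, 0]]  # sites 20 km apart

gamma_dense, rate_dense = gamma_accumulation_rate(
    dense, presence, focal=0, radii=[0.5, 1.5, 2.5, 3.5])
gamma_sparse, rate_sparse = gamma_accumulation_rate(
    sparse, presence, focal=0, radii=[10, 30, 50, 70])
```

In this toy case the two transects gain one taxon per kilometer and one taxon per 20 kilometers of radius, respectively, which shows how the same analysis can return very different scaling rates at different locations.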
Ecologists need to be aware of MAUP and consider it when drawing conclusions from regional-scale analyses and comparing results from multiple studies (Dark and Bram 2007). The utility of MESS is that we can use it to control the zonation problem and minimize scale effects to make clean and robust datasets for comparative analysis, as we demonstrate with aquatic community data. The approach can be used with any diversity metric and can be applied to environmental variables and a range of other data types. While the smoothing approach we advocate here is useful, the conceptual issue of determining the appropriate scale of analysis will always be user defined. Levin (1992) contends that there is no correct scale for describing a system, but that scaling laws may allow for comparisons among studies conducted at different scales. However, when scaling rules vary spatially, as demonstrated in this analysis, such comparisons using scaling laws are not possible. Our examples demonstrate that we can sidestep this problem. Our results show that despite being different ecosystems with different taxa, the stream and lake communities share common characteristics in the relationship between γ- and β-diversity. Ultimately, standardizing and synthesizing community data may provide a way forward to identifying general laws in meta- and macro-community ecology that are applicable across all ecosystems.
Acknowledgements
The views expressed in this paper are those of the authors and do not reflect the official policy of the U.S. Environmental Protection Agency.
Contributor Information
Christopher J. Patrick, Office of Water, Office of Science and Technology, Mail code 4304T, U.S. Environmental Protection Agency, Washington, DC 20460.
Lester L. Yuan, Office of Water, Office of Science and Technology, Mail code 4304T, U.S. Environmental Protection Agency, Washington, DC 20460.
Work Cited
- Anna A, et al. 2012. Distance decay of similarity in freshwater communities: do macro‐ and microorganisms follow the same rules? - Global Ecology and Biogeography 21: 365–375.
- Arrhenius O. 1921. Species and area. - Journal of Ecology 9: 95–99.
- Astorga A, et al. 2014. Habitat heterogeneity drives the geographical distribution of beta diversity: the case of New Zealand stream invertebrates. - Ecology and Evolution 4: 2693–2702.
- Baldeck CA, et al. 2013. Soil resources and topography shape local tree community structure in tropical forests. - Proceedings of the Royal Society of London B: Biological Sciences 280.
- Barton PS, et al. 2013. The spatial scaling of beta diversity. - Global Ecology and Biogeography 22: 639–647.
- Beisner Beatrix E, et al. 2006. The role of environmental and spatial processes in structuring lake communities from bacteria to fish. - Ecology 87: 2985–2991.
- Bennett JR and Gilbert B. 2016. Contrasting beta diversity among regions: how do classical and multivariate approaches compare? - Global Ecology and Biogeography 25: 368–377.
- Borcard D, et al. 2004. Dissecting the spatial structure of ecological data at multiple scales. - Ecology 85: 1826–1832.
- Brown BL, Swan CM, Auerbach DA, Grant EHC, Hitt NP, Maloney KO, and Patrick C. 2011. Metacommunity theory as a multispecies, multiscale framework for studying the influence of river network structure on riverine communities and ecosystems. - Journal of the North American Benthological Society 30: 310–327.
- Cain SA 1938. The Species-Area Curve. - The American Midland Naturalist 19: 573–581.
- Chase JM 2007. Drought mediates the importance of stochastic community assembly. - Proceedings of the National Academy of Science 104: 17430–17434.
- Cottenie K. 2005. Integrating environmental and spatial processes in ecological community dynamics. - Ecology Letters 8: 1175–1182.
- Crist TO, et al. 2003. Partitioning Species Diversity across Landscapes and Regions: A Hierarchical Analysis of Diversity. - The American Naturalist 162: 734–743.
- Dark SJ and Bram D. 2007. The modifiable areal unit problem (MAUP) in physical geography. - Progress in Physical Geography 31: 471–479.
- Dormann CF, et al. 2007. Methods to account for spatial autocorrelation in the analysis of species distributional data: a review. - Ecography 30: 609–628.
- Fotheringham AS and Wong DWS 1991. The Modifiable Areal Unit Problem in Multivariate Statistical Analysis. - Environ. Plan. A 23: 1025–1044.
- Fridley JD, et al. 2007. Co-occurrence based assessment of habitat generalists and specialists: a new approach for the measurement of niche width. - Journal of Ecology 95: 707–722.
- Gering JC and Crist TO 2002. The alpha–beta–regional relationship: providing new insights into local–regional patterns of species richness and scale dependence of diversity components. - Ecology Letters 5: 433–444.
- Grman E. and Brudvig LA 2014. Beta diversity among prairie restorations increases with species pool size, but not through enhanced species sorting. - Journal of Ecology 102: 1017–1024.
- Hanski I. and Gyllenberg M. 1997. Uniting two general patterns in the distribution of species. - Science 284: 334–336.
- Heino J, et al. 2015. A comparative analysis reveals weak relationships between ecological factors and beta diversity of stream insect metacommunities at two spatial levels. - Ecology and Evolution 5: 1235–1248.
- Jelinski DE and Wu J. 1996. The modifiable areal unit problem and implications for landscape ecology. - Landscape Ecology 11: 129–140.
- Jost L. 2007. Partitioning diversity into independent alpha and beta components. - Ecology 88: 2427–2439.
- Kraft NJB, et al. 2011. Disentangling the drivers of beta diversity along latitudinal and elevational gradients. - Science 333: 1755–1758.
- Legendre P, et al. 2005. Analyzing beta diversity: partitioning the spatial variation of community composition data. - Ecological Monographs 75: 435–450.
- Legendre P. and Legendre L. 1998. Numerical Ecology. 2nd Edition - Elsevier.
- Leibold MA, et al. 2004. The metacommunity concept: a framework for multi-scale community ecology. - Ecology Letters 7: 601–613.
- Levin SA 1992. The problem of pattern and scale in ecology: the Robert H. MacArthur award lecture. - Ecology 73: 1943–1967.
- Logue JB, et al. 2011. Empirical approaches to metacommunities: a review and comparison with theory. - Trends in Ecology & Evolution 26: 482–491.
- Louette G. and De Meester L. 2005. High dispersal capacity of cladoceran zooplankton in newly founded communities. - Ecology 86: 353–359.
- Luck M. and Wu J. 2002. A gradient analysis of urban landscape pattern: a case study from the Phoenix metropolitan region, Arizona, USA. - Landscape Ecology 17: 327–339.
- Magurran AE 2004. Measuring biodiversity. - Blackwell Publishing.
- Martiny JBH, et al. 2011. Drivers of bacterial β-diversity depend on spatial scale. - Proceedings of the National Academy of Sciences 108: 7850–7854.
- Mazaris A, et al. 2010. Biogeographical patterns of freshwater micro‐ and macroorganisms: a comparison between phytoplankton, zooplankton and fish in the eastern Mediterranean. - Journal of Biogeography 37: 1341–1351.
- McKinney ML and Lockwood JL 1999. Biotic homogenization: a few winners replacing many losers in the next mass extinction. - Trends in Ecology & Evolution 14: 450–453.
- Minchin PR 1987. Simulation of multidimensional community patterns: toward a comprehensive model. - Vegetation 71: 145–156.
- Mykrä H, et al. 2007. Scale-related patterns in the spatial and environmental components of stream macroinvertebrate assemblage variation. - Global Ecology and Biogeography 16: 149–159.
- Oksanen J, et al. 2014. Vegan: community ecology package. Version 2.1–41. http://r-forge.r-project.org/projects/vegan.
- Olli‐Matti K, et al. 2015. Inferring the effects of potential dispersal routes on the metacommunity structure of stream insects: as the crow flies, as the fish swims or as the fox runs? - Journal of Animal Ecology 84: 1342–1353.
- Openshaw S. 1984. The modifiable areal unit problem. - Geo Books.
- Patrick CJ and Swan CM 2011. Reconstructing the assembly of a stream-insect metacommunity. - Journal of the North American Benthological Society 30: 259–272.
- Rahel FJ 2000. Homogenization of fish faunas across the United States. - Science 288: 854–856.
- Rahel FJ 2002. Homogenization of freshwater faunas. - Annual Review of Ecology and Systematics 33: 291–315.
- Rosenzweig ML 1995. Species diversity in space and time. - Cambridge University Press.
- Schwartz MW, et al. 2006. Biotic homogenization of the California flora in urban and urbanizing regions. - Biological Conservation 127: 282–291.
- Soininen J, et al. 2007. The distance decay of similarity in ecological communities. - Ecography 30: 3–12.
- Southerland MT, et al. 2005. New biological indicators to better assess the condition of Maryland streams. - Maryland Department of Natural Resources.
- Steinbauer MJ, et al. 2012. Current measures for distance decay in similarity of species composition are influenced by study extent and grain size. - Global Ecology and Biogeography 21: 1203–1212.
- Storch D, et al. 2007. Introduction: scaling biodiversity - what is the problem? Scaling Biodiversity; Cambridge.
- Storch D, et al. 2003. Geometry of the species-area relationship in central European birds: testing the mechanism. - Journal of Animal Ecology 72: 509–519.
- Sutherland C, et al. 2015. Modelling non‐Euclidean movement and landscape connectivity in highly structured ecological networks. - Methods in Ecology and Evolution 6: 169–177.
- R Core Team. 2013. R: A language and environment for statistical computing. - R Foundation for Statistical Computing.
- Thessler S, et al. 2005. Mapping gradual landscape-scale floristic changes in Amazonian primary rain forests by combining ordination and remote sensing. - Global Ecology and Biogeography 14: 315–325.
- Tuomisto H. 2010. A diversity of beta diversities: straightening up a concept gone awry. Part 1. Defining beta diversity as a function of alpha and gamma diversity. - Ecography 33: 2–22.
- Veech JA and Crist TO 2010. Toward a unified view of diversity partitioning. - Ecology 91: 1988–1992.