EPA Author Manuscripts
. Author manuscript; available in PMC: 2019 Apr 15.
Published in final edited form as: ISPRS J Photogramm Remote Sens. 2018;146:151–160. doi: 10.1016/j.isprsjprs.2018.09.010

Accuracy assessment of NLCD 2011 impervious cover data for the Chesapeake Bay region, USA

J Wickham a,*, N Herold b, SV Stehman c, CG Homer d, G Xian e, P Claggett f
PMCID: PMC6463313  NIHMSID: NIHMS1013851  PMID: 30996518

Abstract

The National Land Cover Database (NLCD) contains three eras (2001, 2006, 2011) of percentage urban impervious cover (%IC) at the native pixel size (30 m × 30 m) of the Landsat Thematic Mapper satellite. These data are potentially valuable to environmental managers and stakeholders because of the utility of %IC as an indicator of watershed and aquatic condition, but lack an accuracy assessment because of the absence of suitable reference data. Recently developed 1 m2 land cover data for the Chesapeake Bay region make it possible to assess NLCD %IC accuracy for a 262,000 km2 region based on a census rather than a sample of reference data. We report agreement between the two %IC datasets for watersheds, the riparian zones within those watersheds, and four additional square units. The six assessment units were a 40 ha cell, riparian zones (mean = 433 ha), a 2756 ha cell, a 5625 ha cell, watersheds (mean = 8569 ha), and a 22,500 ha cell. Mean Absolute Deviation (MAD) and Mean Deviation (MD) were about 1.5% and −1.5%, respectively, for each of the assessment units except the riparian unit, for which MAD and MD were 0.88% and −0.62%, respectively. NLCD reliably reproduced %IC from the 1 m2 data with a small, consistent tendency for underestimation. Results were sensitive to assessment unit choice. The four largest assessment units had very similar regression parameters, R2 values, and bias patterns. Results for the riparian assessment were different from those for the watershed unit and the other three larger units: MAD was about 50% lower for the riparian zones than for the watersheds, the direction of bias was less consistent, and NLCD %IC was uniformly higher than 1 m2 %IC in urbanized riparian zones. For the smallest unit, bias patterns were more similar to the riparian unit while regression results were more similar to the four larger units. MAD and MD were also sensitive to the amount of urbanization, increasing as NLCD %IC increased. The low overall bias and the positive relationship between bias and urbanization suggest that the benefits of obtaining 1 m2 IC data outside of urban areas may not outweigh the costs of obtaining such data.

Keywords: MAUP, Mean Absolute Deviation (MAD), Mean Deviation (MD), Regression, Water quality

1. Introduction

Impervious cover (IC) is an environmental indicator that is used to establish policy (Brabec, 2009). The states of Connecticut (Bellucci, 2007) and Maine (Maine, 2012) use IC to help identify impaired waters as part of their reporting for the Clean Water Act (33 U.S.C. §1251 et. seq. (1972)). Perhaps first examined as an indicator of watershed and aquatic condition in the 1970s (Hammer, 1972), IC emerged as an important indicator two decades later (Arnold and Gibbons, 1996; Schueler, 1994), and is now widely used to assess watershed and aquatic condition (Brabec et al., 2002; Brabec, 2009; Schueler et al., 2009). Use of IC is widespread because many studies have shown adverse impacts on watershed and aquatic condition (storm flow volume, streambank erosion, biotic integrity, pollutant levels) even at very low levels of IC (e.g., Ourso and Frenzel, 2003; Schiff and Benoit, 2007; Stanfield and Kilgore, 2006). In addition, not unlike its role in assessment of watershed and aquatic condition, IC has also been recognized as an important component of the urban heat island (UHI) effect (Oke, 1982). IC contributes to several of the factors responsible for altered energy dynamics in urbanized areas, including anthropogenic heat, increased storage of sensible heat, and decreased evapotranspiration (Oke, 1995).

Measurement of percentage impervious cover (%IC) from a variety of remote platforms and using a variety of methods has been an active area of research (Slonecker et al., 2001; Weng, 2012) because it is measured more efficiently from the air than from field campaigns (Brabec et al., 2002; Wickham et al., 2014a, 2014b). The National Land Cover Database (NLCD) (www.mrlc.gov), a product of the Multi-Resolution Land Characteristics Consortium (www.epa.gov/mrlc), provides %IC, land cover, and tree canopy density database elements for the contiguous United States based on Landsat TM data (Homer et al., 2004). NLCD has been produced for the nominal years of 2001, 2006 (%IC and land cover only), and 2011 (www.mrlc.gov) (Fry et al., 2011; Homer et al., 2007, 2015). The NLCD %IC dataset classifies each 30 m × 30 m Landsat TM pixel as 0% to 100% impervious cover in 1% increments based on modeled relationships to a sample of high resolution data (Yang et al., 2003).

Accuracy assessment is an important aspect of NLCD (Stehman et al., 2003, 2008; Wickham et al., 2017), but it has been focused on the land cover database element because fiscal and labor constraints necessitated prioritization of the numerous accuracy assessment objectives (Stehman et al., 2008) that arose from the database design (Homer et al., 2004) and temporal aspect of NLCD. Greenfield et al. (2009) and Nowak and Greenfield (2010) assessed the accuracy of NLCD 2001 %IC through photointerpretation of Google Earth™ images, and Wickham et al. (2013) reported cursory estimates of NLCD 2006 %IC accuracy as part of the land cover accuracy assessment for that NLCD era. There are no complementary estimates of %IC accuracy for NLCD 2011. The objective of this research is to document the agreement between NLCD 2011 %IC and the recently released high resolution (1 m2 pixels) land cover data for the Chesapeake Bay watershed (chesapeakeconservancy.org). The Chesapeake Bay data, which extend from New York State through Virginia, provide a suitable reference source (Olofsson et al., 2014) over a broad region for comparison to NLCD 2011.

Availability of census- rather than sample-based reference data for the entirety of the 262,000 km2 Chesapeake Bay watershed (http://www.chesapeakebay.net) creates the opportunity to assess NLCD 2011 %IC accuracy for each 30 m × 30 m NLCD pixel by converting the Chesapeake Bay data to a binary format and summing over all 1 m2 units within each NLCD pixel. This simple approach, however, assumes that the two datasets can be registered to each other precisely; that is, that each NLCD pixel comprises exactly 900 Chesapeake Bay (1 m2) pixels. Without the requisite geometric precision, some portion of the disagreement will be attributable to misregistration (Dai and Khorram, 1998).
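Under that idealized assumption of perfect registration, reference %IC for each NLCD pixel reduces to a block sum over the 900 enclosed 1 m2 cells. A minimal sketch of the aggregation in Python (the array contents and extent are illustrative, not actual Chesapeake Bay data):

```python
import numpy as np

# Binary 1 m impervious raster (1 = impervious); a made-up 3000 x 3000 extent
# that aligns exactly to the 30 m NLCD grid.
ref_1m = np.zeros((3000, 3000), dtype=np.uint8)
ref_1m[:30, :30] = 1  # one fully impervious 30 m pixel in the corner

h, w = ref_1m.shape
assert h % 30 == 0 and w % 30 == 0, "extent must align to the 30 m grid"

# Reshape into (rows, 30, cols, 30) blocks, sum the 900 cells in each block,
# and convert the impervious count to a percentage per 30 m pixel.
blocks = ref_1m.reshape(h // 30, 30, w // 30, 30)
ref_pct_ic = blocks.sum(axis=(1, 3)) / 900.0 * 100.0  # shape (100, 100)
```

In practice this shortcut is exactly what misregistration defeats, which motivated the larger assessment units described below.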

Larger assessment units can be used to mitigate the adverse impact of imperfect spatial registration (Stehman and Wickham, 2011). However, use of assessment units other than an NLCD pixel also necessitates inclusion of a secondary objective, which is to assess the sensitivity of agreement results to assessment unit characteristics. It has long been recognized, particularly in geography, that statistical relationships are often dependent on assessment unit characteristics (Dark and Bram, 2007; Openshaw, 1977). This phenomenon, often referred to as the Modifiable Areal Unit Problem (MAUP), has two main aspects: zonation and scale (Dark and Bram, 2007). Zonation refers to how the data are organized. For example, data could be organized by county or by a square grid whose cells are the average size of the counties. Scale refers to the size of the assessment unit (e.g., a large number of small units versus a small number of large units). Because the reference data constitute a census rather than a sample, results can be tested for sensitivity to assessment unit characteristics. We envisioned that testing the sensitivity of agreement results to a range of sizes (see, for example, Jelinski and Wu, 1996) would be useful information for users of NLCD %IC products.

2. Methods

Land cover data with a spatial resolution of 1 m2 for the Chesapeake Bay region (chesapeakeconservancy.org) were compared to the 30 m × 30 m %IC data from NLCD 2011 (www.mrlc.gov). The Chesapeake Bay data were mapped using imagery from the National Agriculture Imagery Program (NAIP) acquired during 2013, and other ancillary data that included LIDAR and orthophotography where available (chesapeakeconservancy.org). We used the Chesapeake Conservancy’s data for the entire region. These data included six classes: water, barren, trees and shrubs, herbaceous, impervious (other), and impervious (roads). User’s and producer’s accuracies for the Chesapeake Bay data exceeded 90% (chesapeakeconservancy.org), except user’s accuracies for trees and shrubs (87%) and herbaceous (79%). The %IC from the Chesapeake Bay data will hereafter be referred to as the “reference” classification. The modeled relationships used to develop %IC for NLCD 2001 (Yang et al., 2003) were updated to produce NLCD 2006 and 2011 using spectral-based change detection and Classification and Regression Tree (CART) modeling (Xian and Homer, 2010). Each NLCD era is based on the Landsat TM image acquisition date, which can vary somewhat because of cloud cover and other issues that affect data availability. The NLCD 2011 %IC data are nominally 2 years older than the 1 m2 data for the Chesapeake Bay.

Assessment of agreement between NLCD %IC and reference %IC data was not conducted on an NLCD pixel-by-pixel basis. The geometric precision of NLCD is ± 15 m (i.e., ± 1/2 Landsat TM pixel), whereas the geometric precision of the NAIP imagery used to develop the Chesapeake Bay (reference) data is ± 6 m (http://www.fsa.usda.gov). These precision estimates quantify the accuracy of geometric registration to their individual map bases. They do not guarantee that the two datasets will align to each other accurately. An assessment based on NLCD pixels as the spatial unit was not undertaken because of the difficulty of spatially registering the two datasets to each other so that the ground area covered by each NLCD pixel exactly matched the ground area covered by the 900 reference data pixels.

The red arrow and the locations labeled “1” in Fig. 1 illustrate the challenges related to spatial registration of the two datasets. The road intersection at the red arrow appears to be shifted about 45 m south and 15 m east in NLCD relative to the reference data, and thus comparison of %IC at the intersection would yield values of 0% for NLCD and perhaps 50% for the reference data. Shifting NLCD north and west would appear to align the two datasets at the road intersection, but also would appear to result in misregistration of roads at the locations labeled “1.”

Fig. 1.

Fig. 1.

Overlay of NLCD 2011 30 m %IC on Chesapeake Bay land cover in the vicinity of Deposit, NY. See methods for discussion (1 = road locations, 2 = road maintenance area, 3 & 4 = mines, 5 = undetected road in NLCD, red arrow = likely spatial misalignment at road intersection).

Differences in %IC between NLCD and the reference data illustrated in Fig. 1 may also be attributable to inconsistencies in the definitions of imperviousness, misclassification, differences in image acquisition dates, or a combination of such factors. The polygons located at “2”, “3”, “4”, and “5” reflect some of these differences. The polygon labeled “2,” based on inspection of Google Earth™ imagery, appears to be a 1 ha cinder storage area for winter road maintenance. The polygon does not appear to be a “sealed” surface, and therefore its classification as impervious in the reference data could be debated. In the NLCD land cover data, the 1 ha polygon is subsumed into a tract of cropland that is immediately north and west of it. Both datasets include a barren class, which would be the more correct classification. The polygons labeled “3” and “4” are mines, which also would be included in the barren class in both datasets. The mines are classified as impervious cover in the reference data. Polygon “3” is classified as barren by NLCD, but NLCD misclassifies polygon “4” as deciduous forest. Accuracy assessment of both datasets indicates confusion between barren and impervious cover (Pallai and Wesson, 2017; Wickham et al., 2017). The road labeled “5” was not detected in NLCD. Map and reference label differences at polygons “2” and “5” may also be attributable to differences in acquisition dates between the two datasets. Inspection of historical Google Earth™ imagery suggests that the road and the polygonal cinder storage area were constructed between 2009 and 2011. Since there was a ± 1 year window for NLCD 2011 Landsat acquisition, it is possible that the NLCD Landsat data were acquired before the construction of the road and polygonal cinder area.

We used two primary assessment units, watersheds and riparian zones, because of their relevance to watershed and aquatic condition. The primary units were complemented by four additional square units of different sizes (Table 1) to more fully characterize the sensitivity of agreement results to the MAUP (Dark and Bram, 2007). We considered watersheds and riparian zones the focal assessment units because IC is an important indicator of watershed and aquatic condition. The states of Connecticut and Maine use %IC to identify water bodies that do not meet water quality standards as established by the Clean Water Act (Bellucci, 2007; Maine, 2012), and the effect of riparian zone IC on aquatic condition (as opposed to entire watershed IC) is an important area of research (Brabec et al., 2002; Brabec, 2009; Wickham et al., 2014a, 2014b, 2016). The 12-digit hydrologic units (nhd.usgs.gov/wbd.html) served as our watershed unit. They ranged in size from 178 ha to 24,504 ha (mean = 8569 ha). Riparian zones were defined as a 30 m radial buffer around water features (e.g., streams and lakes) within the watersheds. Water features were from the 1:100,000-scale National Hydrography Dataset (NHD) (www.horizon-systems.com). The riparian units ranged in size from 6 ha to 1289 ha (mean = 433 ha). The four square unit sizes were 39.6, 2756.25, 5625, and 22,500 ha, which correspond to NLCD 30 m × 30 m pixel windows of 21 × 21, 175 × 175, 250 × 250, and 500 × 500 pixels, respectively. The six sets of assessment units increase in a log-like progression from the smallest (40 ha) to the largest (22,500 ha), with the 2756 ha and 5625 ha units included to more fully articulate sensitivity to the MAUP (Dark and Bram, 2007). These two units were approximately one-third (2756 ha) and two-thirds (5625 ha) as large as the average size of a watershed.
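The 30 m radial riparian buffer can be illustrated with a small geometric sketch. The stream geometry below is hypothetical; the study buffered 1:100,000-scale NHD water features within each 12-digit watershed, in a projected coordinate system with units of meters:

```python
from shapely.geometry import LineString

# Hypothetical water feature: a 1 km straight stream segment (units: meters).
stream = LineString([(0, 0), (1000, 0)])

# Riparian zone as defined in the text: a 30 m radial buffer around the feature.
riparian = stream.buffer(30)

# Expected area: ~2 * 30 m * 1000 m for the two sides plus ~pi * 30^2 for the
# rounded end caps, i.e. roughly 62,800 m^2 (about 6.3 ha).
print(round(riparian.area))
```

For the full dataset, the same buffer operation would be applied to every NHD feature and the result clipped to the enclosing watershed, yielding one riparian unit per watershed.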

Table 1.

Assessment unit characteristics.

Unit            # NLCD pixels*   Mean area (ha)   Min (ha)   Max (ha)
22,500 ha cell  250,000          22,500           –          –
Watershed       –                8569             178        24,504
5625 ha cell    62,500           5625             –          –
2756 ha cell    30,625           2756             –          –
Riparian        –                433              6          1289
40 ha cell      441              40               –          –

* Square root = window side length in NLCD pixels.

Prior to estimating %IC, the NLCD and reference datasets were aligned using the same geographic coordinate system, which was based on a raster version of the tessellation used for the largest assessment unit (22,500 ha). The alignment was done to ensure that each NLCD pixel included exactly 900 reference pixels in their entirety. The alignment was not intended to address geometric misregistration between the reference and NLCD datasets because it does not account for differences in the geometric relationships between the datasets and the map to which they were registered. Geometric registration would require knowledge of the precise location of a 1 m2 reference pixel within an NLCD pixel for several locations throughout the study area. Both data sets were also masked to exclude water so that %IC was based on the amount of land resolved in each dataset. Water masking was done to control for differences in the amount of land in each assessment unit that was attributable to differences in spatial resolution. The reference data had the potential to resolve many small water bodies (e.g., farm ponds) that were not likely to be resolved in the NLCD data. We chose to mask water because differences in water area could have been substantial when aggregated over an assessment unit.

Comparison of the NLCD and reference data was based on %IC derived for each dataset and each unit. NLCD %IC was computed as the %IC class multiplied by the area of that class, summed over all %IC classes from 1% to 100%, which was then divided by the area (land only) of the unit. The %IC for the reference data was a simple percentage since the 1 m2 spatial resolution permitted a binary classification (impervious or not). Percentage IC for the reference data was the area of the two impervious classes [impervious (other), impervious (roads)] divided by the land area assessment unit.
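These two computations can be sketched concretely as follows (toy arrays, not actual data). Because every NLCD pixel has the same area, the sum of class × area divided by land area reduces to the mean %IC class over land pixels:

```python
import numpy as np

# NLCD side: per-pixel %IC class (0-100, in 1% increments) and a land mask
# (False = water, excluded from the computation). Values are made up.
nlcd_ic = np.array([0, 20, 50, 100, 30, 0])
nlcd_land = np.array([1, 1, 1, 1, 0, 1], dtype=bool)

# Class * area summed over classes, divided by land area: with equal-area
# pixels this is just the mean class value over land pixels.
nlcd_pct = nlcd_ic[nlcd_land].mean()  # 34.0

# Reference side: binary 1 m cells, so %IC is a simple percentage of the
# land-only cells labeled impervious (other) or impervious (roads).
ref_imperv = np.array([1, 1, 0, 0, 0, 0, 0, 1, 0, 0], dtype=bool)
ref_land = np.ones_like(ref_imperv, dtype=bool)
ref_pct = 100.0 * ref_imperv[ref_land].sum() / ref_land.sum()  # 30.0
```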

The per assessment unit difference between NLCD %IC and reference %IC (derived from the Chesapeake Bay 1 m2 data) provided the basis for assessing map-reference agreement. Agreement was quantified based on the differences (deviations) defined as: NLCD %IC – reference %IC. Mean Deviation (MD) and Mean Absolute Deviation (MAD) are the mean and mean of the absolute value of the deviations, respectively, computed for each of the six types of assessment units. The slopes and intercepts from ordinary least squares (OLS) regression complemented MAD and MD. OLS regressions were constructed to determine how well the coarser resolution NLCD %IC predicted the reference %IC by treating the reference %IC as the dependent variable and NLCD %IC as the independent variable. OLS slopes would be 1 and intercepts would be 0 if there was no bias (i.e., 1:1 trend line). Regression scatterplots provided informative visual assessments of the relationship between the two %IC datasets, and the regression coefficients, MAD and MD provided (together and separately) informative summaries of bias and useful information for displaying geographic patterns of bias.
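The agreement statistics above can be sketched as follows, with made-up per-unit values for illustration; `np.polyfit` stands in for the OLS fit, with reference %IC as the dependent variable:

```python
import numpy as np

# Made-up per-assessment-unit %IC values (not data from the paper).
nlcd = np.array([2.0, 5.0, 10.0, 20.0, 40.0])
ref = np.array([3.0, 6.5, 12.0, 22.0, 43.0])

# Deviations defined as NLCD %IC - reference %IC.
dev = nlcd - ref
md = dev.mean()           # Mean Deviation (negative = NLCD underestimates)
mad = np.abs(dev).mean()  # Mean Absolute Deviation

# OLS regression of reference %IC on NLCD %IC; slope = 1 and intercept = 0
# would indicate no bias (a 1:1 trend line).
slope, intercept = np.polyfit(nlcd, ref, 1)
```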

3. Results

Agreement between reference %IC and NLCD %IC was not consistent across the six assessment units (Table 2), suggesting that the relationship between the two %IC datasets was sensitive to the MAUP (Dark and Bram, 2007). The results were consistent for the four largest assessment units (Table 1), with nearly identical MAD, MD, slopes, intercepts, and R2 values. The MAUP arose with the two smallest assessment units (40 ha cell and riparian zones). Agreement for these two sets of units was different from that for the four largest assessment units and different from each other. Recognizing that comparison of the riparian results to results from the other assessment unit sets constitutes a comparison of a subset of the study region to the entire study region, agreement statistics for the riparian unit were noticeably different from the other five assessment unit sets. For the riparian assessment unit, the R2 was noticeably lower (≈10%), but the MAD and MD values were also noticeably lower than for the other assessment unit sets. Regression results for the smallest assessment unit (40 ha) were similar to the results for the four largest assessment units, but included more error and a smaller intercept. The larger error was attributable to the greater sensitivity to classification disagreement (Fig. S1A). The observations immediately adjacent and parallel to the abscissa and ordinate indicate absence of IC in one of the two datasets, which is a stronger indication of mismatched map labels than of spatial misalignment. The likelihood of IC occurring in one dataset but not the other declined as the number of pixels in the assessment unit increased (Fig. S1B).

Table 2.

Regression results by assessment unit.

Unit            # Obs     R2     Slope   Intercept   MAD    MD
22,500 ha cell  1125      0.96   1.08    1.28        1.50   −1.48
Watershed       2603      0.96   1.09    1.28        1.54   −1.51
5625 ha cell    4193      0.96   1.07    1.30        1.52   −1.48
2756 ha cell    8677      0.96   1.07    1.30        1.52   −1.48
Riparian        2602      0.86   0.81    0.92        0.88   −0.62
40 ha cell      546,345   0.92   1.06    1.04        1.60   −1.45

There were noticeable differences in the NLCD-reference %IC relationship between the watershed and riparian units (Figs. 2–5). NLCD %IC was less than reference %IC for approximately 98% of the watershed assessment units, and the units in which NLCD %IC was greater than reference %IC were restricted to those where NLCD %IC was at least 15% (Fig. 2). For the riparian assessment unit, there was a more equal mix of over- (12%) and underestimation (88%) of reference %IC by NLCD %IC, which shifted to a pattern of nearly uniform overestimation when NLCD %IC reached about 15% (Fig. 3). The difference in the NLCD-reference %IC relationship between the two assessment units produced an “inverse” spatial pattern in which reference %IC tended to be greater than NLCD %IC in urbanized watersheds (Fig. 4) and less than NLCD %IC in the riparian zones of those same urbanized watersheds (Fig. 5).

Fig. 2.

Fig. 2.

Regression (black) and 1:1 trend (gray) lines for reference %IC versus NLCD %IC for the watershed assessment unit. Data points above the 1:1 trend line identify underestimation of reference %IC by NLCD %IC.

Fig. 5.

Fig. 5.

Regional variation in impervious cover bias for the riparian assessment unit. Legend is based on the watershed in which the riparian unit is embedded.

Fig. 3.

Fig. 3.

Regression (black) and 1:1 trend (gray) lines for reference %IC versus NLCD %IC for the riparian assessment unit. Data points above the 1:1 trend line identify underestimation of reference %IC by NLCD %IC.

Fig. 4.

Fig. 4.

Regional variation in impervious cover bias for the watershed assessment unit.

The pattern of MAD and MD by NLCD %IC intervals supports the regression results and spatial patterns (Table 3). MAD increased as NLCD %IC increased for all assessment unit sets. Differences between the four largest and two smallest assessment units were related to the magnitude of increase in MAD and to differences between MAD and MD as urbanization increased. For the four largest assessment units, increases in MAD with increases in urbanization were modest, and differences between MAD and MD indicated that the uniformity of NLCD %IC’s underestimation of reference %IC was disrupted only when NLCD %IC reached ≈15%. For the two smallest units, there were more substantial increases in MAD and a more equal mix of positive and negative bias across the NLCD intervals. The uniformity of positive bias when NLCD %IC reached 20% was unique to the riparian assessment unit. Overall, deviations were within ± 1.5 for 60% of the watersheds and 86% of the riparian zones.

Table 3.

MAD and MD values by assessment unit and NLCD 2011 %IC intervals. Lower ends of ranges are inclusive and higher ends are exclusive.

Unit / statistic     Overall   0–5%      5–10%    10–15%   15–20%   20–25%   > 25%
22,500 ha cell
  MAD                1.50      1.29      2.79     3.15     2.80     3.46     3.56
  MD                 −1.48     −1.29     −2.79    −3.10    −2.68    −2.98    −1.91
  # observations     1125      984       77       25       17       9        13
Watershed
  MAD                1.54      1.32      2.56     3.39     3.32     3.41     3.79
  MD                 −1.51     −1.32     −2.56    −3.39    −3.10    −2.83    −2.65
  # observations     2603      2271      146      75       48       23       40
5625 ha cell
  MAD                1.52      1.29      2.75     3.57     3.23     4.16     3.30
  MD                 −1.48     −1.29     −2.71    −3.55    −2.99    −3.82    −1.98
  # observations     4193      3679      236      100      56       42       80
2756 ha cell
  MAD                1.52      1.28      2.89     3.30     3.92     4.13     3.73
  MD                 −1.48     −1.27     −2.85    −3.25    −3.71    −3.37    −2.13
  # observations     8677      7629      466      206      128      84       164
Riparian
  MAD                0.88      0.81      1.14     1.90     3.16     4.75     7.60
  MD                 −0.62     −0.75     0.26     1.36     2.23     4.75     7.60
  # observations     2602      2405      129      43       16       4        5
40 ha cell
  MAD                1.60      1.24      4.17     5.12     5.47     5.31     5.40
  MD                 −1.45     −1.21     −3.76    −4.47    −4.59    −3.94    −2.35
  # observations     546,345   492,565   17,849   8971     6437     4871     15,652
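A stratified summary of this kind can be produced by binning per-unit deviations into NLCD %IC intervals, with lower ends inclusive and upper ends exclusive. A hedged pandas sketch with made-up inputs:

```python
import numpy as np
import pandas as pd

# Made-up per-unit NLCD %IC values and deviations (NLCD - reference).
df = pd.DataFrame({
    "nlcd_pct": [1.0, 4.9, 5.0, 7.5, 12.0, 30.0],
    "dev":      [-1.0, -1.5, -2.0, -3.0, -3.5, -2.0],
})

# Intervals matching Table 3: [0,5), [5,10), ..., [25, inf).
bins = [0, 5, 10, 15, 20, 25, np.inf]
labels = ["0-5%", "5-10%", "10-15%", "15-20%", "20-25%", ">25%"]
df["interval"] = pd.cut(df["nlcd_pct"], bins=bins, labels=labels,
                        right=False, include_lowest=True)

# MAD, MD, and observation count per interval.
summary = df.groupby("interval", observed=True)["dev"].agg(
    MAD=lambda d: d.abs().mean(), MD="mean", n="count")
```

Here `right=False` makes lower bin edges inclusive and upper edges exclusive, matching the table's convention.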

The divergence of agreement results between the two smallest and four largest assessment units was attributable to misclassification in both datasets (Wickham et al., 2017; Foody, 2002, 2014), spatial misalignment between the two IC datasets, and the challenges related to accurately capturing, at a 30 m × 30 m spatial resolution, the fine-grained detail typical of residential areas in urbanized and similar landscapes. Examples of these issues are provided as Supplemental Information. Misclassification of forest as impervious cover in the reference data contributed to a %IC difference of 7% for the riparian zone of a watershed in the Shenandoah Valley (Fig. S2). NLCD misclassification of a mine as urban contributed to a %IC difference of 12% for a watershed riparian zone near York, PA (Fig. S3), and the same error (mine as urban) in the reference data contributed to a %IC difference of 5% for a watershed riparian zone near Somerset, PA (Fig. S4). The tendency toward overestimation of reference %IC by NLCD %IC for riparian zones in urban settings (Fig. 5) is likely attributable to the difficulty of detecting fine-grained land cover patterns at a 30 m × 30 m spatial resolution as well as to spatial misregistration, both of which become more influential as assessment unit size decreases (Fig. S5).

4. Discussion

Ever-increasing computing power and the emergence of cost-free Landsat data (Wulder et al., 2012) have fostered many efforts to map land cover and related information at national to global scales (Fry et al., 2011; Hansen et al., 2013; Homer et al., 2015; Zhu et al., 2016). At the same time, there has been undeniable growth in the development of very high resolution (e.g., 1 m2) land cover data (chesapeakeconservancy.org; coast.noaa.gov/digitalcoast/data/ccaphighres.html; www.epa.gov/enviroatlas). It is unlikely, however, that 1 m2 land cover for the continental United States will be realized in the near term, if for no other reason than the size of such a dataset would approach 8 trillion pixels (900 × ~8.9 billion Landsat TM pixels). In the near term, local- to regional-scale high resolution land cover provides much needed data for evaluation and improvement of Landsat-based per-pixel %IC classifications. In most cases, evaluations (e.g., accuracy assessment), such as we have undertaken here, will have to address issues related to geometric registration of datasets that are likely to have dramatically different spatial resolutions, misclassification in both datasets, and differences or inconsistencies in class definitions.

Because of the spatial resolution differences between the two datasets, we used assessment units other than a 30 m × 30 m pixel to mitigate some of the impact that spatial misregistration may have on assessments conducted at the pixel level. Using a geographic unit other than a pixel had the potential to introduce the MAUP (Dark and Bram, 2007; Openshaw, 1977), and we chose a range of assessment unit sizes, akin to the approach used by Jelinski and Wu (1996), because the MAUP is a largely intractable problem (Dark and Bram, 2007). Agreement was sensitive to assessment unit size. Patterns of agreement were consistent across the four largest units, but the patterns for the two smallest units were not consistent with the other four. Deviations between NLCD %IC and reference %IC were more variable for the two smaller units across nearly the entire range of NLCD %IC. The greater variance in MAD and MD for the two smaller units was not surprising because local regions of %IC misclassification are not compensated for as easily as in a larger assessment unit, where aggregation effects may tend to smooth out errors. The pattern of agreement across the six assessment units suggests that confidence in map accuracy assessment results declined as the size (area) of the assessment unit declined.

It is well-established that reference data are not free of error (Foody, 2002; Khorram et al., 1999; Lunetta et al., 2001; Mann and Rothley, 2006), and many of the supplemental Figures (S1A, S2, S4) document that the reference data used here had misclassification errors. Because error-free reference data is an ideal rather than a reality (Khorram et al., 1999), the protocol for reference data has been higher quality rather than perfection (Khorram et al., 1999; Olofsson et al., 2014). The established higher quality protocol for reference data was met in this analysis through use of data with higher spatial resolution (Khorram et al., 1999) and therefore meets the recommended standard for accuracy assessment (Olofsson et al., 2014). What is perhaps different in this analysis is the census-based nature of the reference data permits a more precise spatial articulation of reference data misclassification that can be used to quantify and map localized regions where reference data error is the likely source of disagreement. Very small assessment units (e.g., Fig. S1A) are useful for identifying such regions.

As noted previously, statistical relationships can be affected by both the size and shape (zonation) of the unit used as the basis for constructing the relationship (Dark and Bram, 2007). We focused on the size aspect of the MAUP by using six assessment units that had different areal extents. We did not formally address the effect of assessment unit shape even though the shapes of the assessment units were not uniform. The sinuous shape of the riparian unit may have had some effect. The smallest unit (40 ha cell) was only about 10% of the average area of the riparian unit, yet its regression results were similar to those for the other non-riparian units, suggesting that assessment unit shape could have influenced the riparian results. It is possible, for example, that use of a riparian assessment unit based on larger watersheds (e.g., 10- or 8-digit hydrologic units) would have yielded results more similar to our riparian unit than to our other units, despite the increase in riparian unit area that would have been realized by using larger watersheds.

Our results were consistent with previous accuracy assessments of impervious cover (Greenfield et al., 2009; Nowak and Greenfield, 2010; Powell et al., 2007). Powell et al. (2007) compared Landsat TM fractional maps of impervious and other cover classes constructed from spectral endmember analysis to higher resolution maps from aerial videography. They found that agreement (R2) increased as the size of the assessment unit increased from 1 × 1 to 17 × 17 Landsat TM pixels, and their results for the largest window size (17 × 17 pixels) were similar to our results for the 40 ha (21 × 21 pixels) assessment unit. Greenfield et al. (2009) and Nowak and Greenfield (2010) compared %IC derived from sample points photo-interpreted using Google Earth™ imagery to NLCD %IC for NLCD mapping zones (Homer and Gallant, 2001), counties, and Census designated places. NLCD %IC tended to underestimate %IC from the photointerpreted points, matching the predominant pattern we found. For the evaluation based on NLCD mapping zones (Nowak and Greenfield, 2010), there was an east-to-west decrease in the magnitude of bias, which may have been attributable to a decline in obstruction of below-canopy IC by tree cover. The obstruction of IC by tree canopies was noted as a confounding factor in the creation of the reference 1 m2 land cover data (Pallai and Wesson, 2017). We recognized the intuitive logic of IC being obscured by overlying canopy and therefore regressed the residuals from the watershed map-reference %IC model against the proportion of the tree and shrub class from the reference data, but we did not find a significant relationship. Instead, our results indicated that the amount of IC was another confounding factor in addition to obstruction by tree canopy, since we found that bias tended to increase as urbanization increased.

Our results have important implications for the use of NLCD %IC for assessment of aquatic condition. NLCD %IC is most accurate across the range of impervious cover where adverse impacts on watershed and aquatic condition begin to appear. Watershed and aquatic condition can be adversely affected by IC at levels as low as 5% (Ourso and Frenzel, 2003; Schiff and Benoit, 2007; Stanfield and Kilgore, 2006), with restoration becoming impractical or ineffective as %IC exceeds 20% (Arnold and Gibbons, 1996; Schueler et al., 2009). Bias in NLCD %IC was minimal at low levels of IC where watershed and aquatic impairment may begin to appear. MAD values were ≤ 3.35% for the watershed and riparian assessment units when NLCD %IC was less than 20%. These results suggest that the costs of obtaining 1 m2 IC data may outweigh the benefits in areas that are not urbanized.

The emergence of high-resolution land cover maps from digital NAIP imagery is a genuine advance in remote sensing for a wide variety of applications (Popkin, 2018). NAIP-based high-resolution land cover provided a validation source for NLCD 2011 %IC that was previously unavailable. We documented spatial misregistration between the two datasets and therefore avoided using NLCD pixels as the spatial support unit for the accuracy assessment. Despite the expected sensitivity to assessment unit size (Dark and Bram, 2007), we found that mean deviations between NLCD %IC and %IC derived from 1 m2 land cover data were small for two assessment units (watersheds and riparian zones) commonly used by those concerned with the impact of impervious cover on watershed and aquatic condition. The results are robust in the sense that the reference data covered a large area comprising a wide variety of impervious cover densities and patterns, but they do not support inference-based extrapolation to other locations because the geographic extent of the reference data necessitated a case study format.

Supplementary Material

Supp Info

Acknowledgements

The United States Environmental Protection Agency, through its Office of Research and Development, partly funded and managed the research described here. The article has been reviewed by the US EPA’s Office of Research and Development and approved for publication; approval does not signify that the contents reflect the views of the US EPA. We appreciate the insights of M. Mehaffey (US EPA) and anonymous reviewers on previous versions of the article. S. Stehman’s participation was underwritten by contract G12AC20221 between SUNY-ESF and USGS.

Footnotes

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.isprsjprs.2018.09.010.

References

  1. Arnold CLJ, Gibbons CJ, 1996. Impervious surface coverage: the emergence of a key environmental indicator. J. Am. Plan. Assoc. 62, 243–258.
  2. Bellucci C, 2007. Stormwater and aquatic life: making the connection between impervious cover and aquatic life impairments for TMDL development in Connecticut streams. In: Proceedings of the Water Environment Federation, TMDL 2007, DOI: 10.2175/193864707786619819, pp. 1003–1018 (accessed 20 July 2017).
  3. Brabec EA, 2009. Imperviousness and land-use policy: toward an effective approach to watershed planning. J. Hydrol. Eng. 14, 425–433.
  4. Brabec EA, Schulte S, Richards PL, 2002. Impervious surfaces and water quality: a review of current literature and its implications for watershed planning. J. Plan. Lit. 16, 499–514.
  5. Dai X, Khorram S, 1998. The effects of image misregistration on the accuracy of remotely sensed change detection. IEEE Trans. Geosci. Remote Sens. 36, 1566–1577.
  6. Dark SJ, Bram D, 2007. The modifiable areal unit problem (MAUP) in physical geography. Prog. Phys. Geogr. 31, 471–479.
  7. Foody GM, 2014. Ground reference data error and the mis-estimation of the area of land cover change as a function of its abundance. Remote Sens. Lett. 4, 783–792.
  8. Foody GM, 2002. Status of land cover classification accuracy assessment. Remote Sens. Environ. 80, 185–201.
  9. Fry JA, Xian G, Jin S, Dewitz JA, Homer CG, Yang L, Barnes CA, Herold ND, Wickham J, 2011. Completion of the 2006 National Land Cover Database for the conterminous United States. Photogramm. Eng. Remote Sens. 77, 858–864.
  10. Greenfield EJ, Nowak DJ, Walton JT, 2009. Assessment of 2001 NLCD percent tree and impervious cover estimates. Photogramm. Eng. Remote Sens. 75, 1279–1286.
  11. Hammer TR, 1972. Stream channel enlargement due to urbanization. Water Resour. Res. 8, 1530–1540.
  12. Hansen MC, Potapov PV, Moore R, Hancher M, Turubanova SA, Tyukavina A, Thau D, Stehman SV, Goetz SJ, Loveland TR, Kommareddy A, Egorov A, Chini L, Justice CO, Townshend JRG, 2013. High-resolution global maps of 21st-century forest cover change. Science 342, 850–853.
  13. Homer C, Dewitz J, Fry J, Coan M, Hossain N, Larson C, Herold N, McKerrow A, VanDriel N, Wickham J, 2007. Completion of the 2001 National Land Cover Database for the conterminous United States. Photogramm. Eng. Remote Sens. 73, 337–341.
  14. Homer CG, Dewitz JA, Yang L, Jin S, Danielson P, Xian G, Coulston J, Herold ND, Wickham J, Megown K, 2015. Completion of the 2011 National Land Cover Database for the conterminous United States – representing a decade of land cover change information. Photogramm. Eng. Remote Sens. 81, 345–354.
  15. Homer C, Gallant A, 2001. Partitioning the Conterminous United States into Mapping Zones for Landsat TM Land Cover Mapping. USGS White Paper. https://landcover.usgs.gov/pdf/homer.pdf (last accessed 13 September 2018).
  16. Homer C, Huang C, Yang L, Wylie B, Coan M, 2004. Development of a 2001 National Land Cover Database for the United States. Photogramm. Eng. Remote Sens. 70, 829–840.
  17. Jelinski DE, Wu J, 1996. The modifiable areal unit problem and implications for landscape ecology. Landscape Ecol. 11, 129–140.
  18. Khorram S (Ed.), Biging GS, Chrisman NR, Colby DR, Congalton RG, Dobson JE, Ferguson RL, Goodchild MF, Jensen JR, Mace TH, 1999. Accuracy Assessment of Remote Sensing-Derived Change Detection. American Society for Photogrammetry and Remote Sensing (ASPRS), Monograph Series, ISBN 1-57083-058-4, Bethesda, MD, USA.
  19. Lunetta RS, Iiames J, Knight J, Congalton RG, Mace TH, 2001. An assessment of reference data variability using a “virtual field reference database”. Photogramm. Eng. Remote Sens. 63, 707–715.
  20. Maine, 2012. Maine impervious cover total maximum daily load assessment (TMDL) for impaired streams. Maine Department of Environmental Protection, DEPLW-39, Augusta, Maine. http://www.maine.gov/dep/water/monitoring/tmdl/2012/IC%20TMDL_Sept_2012.pdf.
  21. Mann S, Rothley KD, 2006. Sensitivity of Landsat/IKONOS accuracy comparison to errors in photointerpreted reference data and variations in test points. Int. J. Remote Sens. 27, 25027–25036.
  22. Nowak DJ, Greenfield EJ, 2010. Evaluating the National Land Cover Database tree canopy and impervious cover estimates across the conterminous United States: a comparison with photo-interpreted estimates. Environ. Manage. 46, 378–390.
  23. Oke TR, 1995. The heat island of the urban boundary layer: characteristics, causes and effects. In: Cermak JE (Ed.), Wind Climate in Cities. Kluwer Academic Publishers, The Hague, Netherlands, pp. 81–107.
  24. Oke TR, 1982. The energetic basis of the urban heat island. Q. J. R. Meteorol. Soc. 108, 1–24.
  25. Olofsson P, Foody GM, Herold M, Stehman SV, Woodcock CE, Wulder MA, 2014. Good practices for estimating area and assessing accuracy of land change. Remote Sens. Environ. 148, 42–57.
  26. Openshaw S, 1977. A geographical solution to scale and aggregation problems in region-building, partitioning, and spatial modelling. Trans. Inst. Br. Geogr. 2, 459–472.
  27. Ourso RT, Frenzel SA, 2003. Identification of linear and threshold responses in streams along a gradient of urbanization in Anchorage, Alaska. Hydrobiologia 501, 117–131.
  28. Popkin G, 2018. US government considers charging for popular Earth-observing data. Nature 556, 417–418.
  29. Powell RL, Roberts D, Dennison PE, Hess LL, 2007. Sub-pixel mapping of urban land cover using multiple endmember spectral mixture analysis: Manaus, Brazil. Remote Sens. Environ. 106, 253–267.
  30. Pallai C, Wesson K, 2017. Chesapeake Bay Program partnership high-resolution land cover classification accuracy assessment methodology. https://chesapeakeconservancy.org/wp-content/uploads/2017/01/Chesapeake_Conservancy_Accuracy_Assessment_Methodology.pdf (accessed 27 June 2018).
  31. Schiff R, Benoit G, 2007. Effects of impervious cover at multiple spatial scales on coastal watersheds. J. Am. Water Resour. Assoc. (JAWRA) 43, 712–730.
  32. Schueler TR, Fraley-McNeal L, Cappiella K, 2009. Is impervious cover still important? Review of recent research. J. Hydrol. Eng. 14, 309–315.
  33. Schueler TR, 1994. The importance of imperviousness. Watershed Protect. Tech. 1, 100–111.
  34. Slonecker ET, Jennings DB, Garofalo D, 2001. Remote sensing of impervious surfaces: a review. Remote Sens. Rev. 20, 227–255.
  35. Stanfield LW, Kilgore BW, 2006. Effects of percent impervious cover on fish and benthos assemblages and instream habitats in Lake Ontario tributaries. Am. Fish. Soc. Symp. 48, 577–599.
  36. Stehman SV, Wickham JD, Smith JH, Yang L, 2003. Thematic accuracy of the 1992 National Land-Cover Data (NLCD) for the eastern United States: statistical methodology and regional results. Remote Sens. Environ. 86, 500–516.
  37. Stehman SV, Wickham J, Wade TG, Smith JH, 2008. Designing a multi-objective, multi-support accuracy assessment of the 2001 National Land Cover Data (NLCD 2001) of the United States. Photogramm. Eng. Remote Sens. 74, 1561–1571.
  38. Stehman SV, Wickham J, 2011. Pixels, blocks of pixels, and polygons: choosing a spatial unit for thematic accuracy assessment. Remote Sens. Environ. 115, 3044–3055.
  39. Yang L, Huang C, Homer CG, Wylie BK, Coan MJ, 2003. An approach for mapping large-area impervious surfaces: synergistic use of Landsat 7 ETM+ and high spatial resolution imagery. Can. J. Remote Sens. 29, 230–240.
  40. Weng Q, 2012. Remote sensing of impervious surfaces in urban areas: requirements, methods, and trends. Remote Sens. Environ. 117, 34–49.
  41. Wickham J, Homer C, Vogelmann J, McKerrow A, Mueller R, Herold N, Coulston J, 2014a. The Multi-Resolution Land Characteristics Consortium – 20 years of development and integration of USA national land cover data. Remote Sens. 6, 7424–7441. https://doi.org/10.3390/rs6087424.
  42. Wickham J, Stehman SV, Gass L, Dewitz J, Fry JA, Wade TG, 2013. Accuracy assessment of NLCD 2006 land cover and impervious surface. Remote Sens. Environ. 130, 294–304.
  43. Wickham J, Stehman SV, Gass L, Dewitz JA, Sorenson DG, Granneman BJ, Poss RV, Baer LA, 2017. Thematic accuracy assessment of the 2011 National Land Cover Database (NLCD). Remote Sens. Environ. 191, 328–341.
  44. Wickham J, Wade TG, Norton DG, 2014b. Spatial patterns of impervious cover relative to stream location. Ecol. Indic. 40, 109–116.
  45. Wickham J, Neale A, Mehaffey M, Jarnagin T, Norton D, 2016. Temporal trends in the spatial distribution of impervious cover relative to stream location. J. Am. Water Resour. Assoc. (JAWRA) 52, 409–419.
  46. Wulder MA, Masek JG, Cohen WB, Loveland TR, Woodcock CE, 2012. Opening the archive: how free data has enabled the science and monitoring promise of Landsat. Remote Sens. Environ. 122, 2–10.
  47. Xian G, Homer C, 2010. Updating the 2001 National Land Cover Database impervious surface products to 2006 using Landsat imagery change detection methods. Remote Sens. Environ. 114, 1676–1686.
  48. Zhu Z, Gallant AL, Woodcock CE, Pengra B, Olofsson P, Loveland TR, Jin S, Dahal D, Yang L, Auch RF, 2016. Optimizing selection of training and auxiliary data for operational land cover classification for the LCMAP initiative. ISPRS J. Photogramm. Remote Sens. 122, 206–221.
