Heliyon. 2024 Aug 23;10(17):e36660. doi: 10.1016/j.heliyon.2024.e36660

Surface water monitoring from 1984 to 2021 based on Landsat time-series images and Google Earth Engine

Bingxue Zhao a,b, Lei Wang a,c
PMCID: PMC11388736  PMID: 39263062

Abstract

Dynamic monitoring of surface water bodies is essential for understanding global climate change and the impact of human activities on water resources. Satellite remote sensing offers large-scale coverage, timely updates, and simplicity, and it has become an important means of mapping the distribution of surface water bodies. Using long time-series Landsat satellite images and the Google Earth Engine (GEE) platform, this study focuses on Anhui Province, China, and proposes a surface water extraction method that combines water indices, the Bias-Corrected Fuzzy Clustering Method (BCFCM), and OTSU threshold segmentation. The spatial distribution of surface water in Anhui Province was obtained from 1984 to 2021, and the spatiotemporal characteristics of surface water in each city and in the three major river basins within the province were further analyzed. The results indicated that the overall accuracy of water extraction in this study was 94.06 %. Surface water in Anhui was most abundant in 1998 and least abundant in 2001, with more water in the south than in the north. Northern Anhui is dominated by rivers, while southern Anhui has more lakes. Permanent surface water, with an inundation frequency above 75 %, covered approximately 4341 km², accounting for 32.03 % of the total water area, while seasonal water, with an inundation frequency between 5 % and 75 %, covered about 6661 km², accounting for 49.15 % of the total water area; the remainder was considered temporary surface water. Compared with the global annual surface water dataset released by the Joint Research Centre (JRC), our results were more complete for lakes and rivers, but the extracted extent of aquaculture areas was slightly smaller than in the JRC dataset. Overall, the long-term surface water dataset established in this study can effectively supplement existing datasets and provide an important reference for regional water resource investigation and management, as well as flood monitoring.

Keywords: Surface water, Remote sensing, Time series images, GEE, Anhui Province

1. Introduction

Surface water, including rivers, lakes, and reservoirs, is an essential component of water resources and plays a crucial role in the ecological environment, the social economy, and sustainable human development [1,2]. In recent decades, the combined pressures of climatic conditions and human activities have led to a continuous decrease in available surface water and to water pollution, which has become a major problem for humanity [3]. The distribution and area of surface water bodies vary significantly between years and seasons. Therefore, continuous monitoring of the spatial distribution of surface water, accurate analysis of its evolution, and rapid understanding of the regional water resource balance are of great significance. Such monitoring helps reveal the impacts of both natural factors and human activities on surface water, and contributes to water resource investigation and management, flood and drought monitoring, disaster assessment, agricultural production, comprehensive watershed management, and the sustainable utilization and protection of water resources [4–6].

With the development of spatial information technology, remote sensing has become an important means of extracting information about the terrestrial surface. Because water bodies have distinct spectral characteristics and their spatial distribution is strongly continuous and homogeneous, remote sensing can be used to map water bodies over large areas. Compared with conventional field surveys, satellite remote sensing is fast and simple, particularly for large-scale and long time-series studies [7,8]. The Google Earth Engine (GEE) is an online cloud platform for image retrieval and processing that hosts a large, publicly and readily available collection of geospatial datasets [9]. Owing to its simplicity and efficiency, GEE has been used in the analysis of land use and land cover change [10], urban change [11], wetland mapping [12], crop classification [13,14], and water extraction [15–18].

Accurately obtaining the area of water bodies is an important part of the geographical census and provides the basis for studies of hydrology, ecology, agriculture, biodiversity conservation, and natural resource development. The basic principle of extracting water from remote sensing imagery is to enhance the spectral brightness of water bodies and suppress the brightness of non-water features such as vegetation and land [2,19,20]. In recent years, numerous scholars have conducted extensive research on identifying surface water bodies using remote sensing and have proposed a series of algorithms and models for water extraction. These methods can be roughly categorized into single-band threshold [21,22], multi-band water index [19,20,23,24], object-oriented classification [25], machine learning [26,27], and deep learning [28] methods. The single-band threshold method typically uses near-infrared and mid-infrared bands. It is simple and easy to operate, but it confuses water with other features more often and has relatively low extraction accuracy, especially for small or severely polluted water bodies in complex backgrounds [29]. Multi-band spectral indices, which are more widely used than single-band thresholds, construct mathematical models through band ratio operations to enhance water bodies and suppress non-water features. Commonly used water indices include the Normalized Difference Water Index (NDWI), the Modified Normalized Difference Water Index (MNDWI), the Automated Water Extraction Index (AWEI), the 2015 Water Index (WI2015), and the Multi-band Water Index (MBWI) [30]. Some studies have compared these water indices and shown that AWEI and WI2015 have significantly better extraction accuracy [23,24]. Object-oriented classification extracts targets by establishing one or more rules and is generally based on high-resolution images. Machine learning mainly includes methods such as support vector machine (SVM), random forest (RF), and decision tree classification. It requires selecting training samples or establishing classification rules to extract features, which can be time-consuming and demanding for the interpreter. Deep learning, including Deep Neural Networks (DNN), Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN), has developed rapidly in recent years and is particularly suitable for identifying features in large-scale areas [31]. However, it requires constructing large datasets to achieve high-precision extraction results. In addition, some scholars have constructed a surface water extraction method, WIMFCF (Water Index Modified Fuzzy Clustering Method), which combines water indices with the Modified Fuzzy Clustering (FCM) algorithm. After FCM processing, the background of the remote sensing image is homogenized and the spectral characteristics of surface water are enhanced, thereby improving the accuracy of water body extraction [2,32]. Threshold-based methods are the most common approach to water extraction: an appropriate threshold divides the water index image into water and non-water classes. However, because large-scale images are mostly captured at different times, they exhibit temporal and spatial variations, so a manually chosen threshold is susceptible to the subjectivity of the observer [33]. In previous studies, the method proposed by Otsu has commonly been used to determine the threshold for grayscale images automatically [34].

In recent years, several scholars have published datasets showing the distribution of global water bodies. For example, the Global Lakes and Wetlands Database (GLWD) is a global dataset of lake and wetland distributions based on historical maps, with a spatial resolution of 30 arc seconds (∼1 km) [35]. The Global water dataset has a resolution of 0.5 arc seconds and is based on Landsat and topographical data [1]. A global static surface water distribution dataset from 2000 to 2002 was produced by combining Shuttle Radar Topography Mission (SRTM) data with 250 m Moderate Resolution Imaging Spectroradiometer (MODIS) data [36]. The global static surface water database G3WBM has a spatial resolution of approximately 3 arc seconds (∼90 m) and can identify permanent water bodies within an area of 3.25 million km² globally [37]. A global inland water dataset from 1999 to 2018 was constructed based on Landsat imagery [38]. Additionally, global land cover datasets that include water bodies at 30 m resolution, for single or multiple years, were produced based on Landsat satellite data [39,40]. More recently, as the available remote sensing imagery has improved, a global land cover dataset for the years 2017–2021 was completed based on Sentinel-2 MSI images with 10 m resolution [41,42]. Among global datasets, the Global Surface Water (GSW) dataset developed by the Joint Research Centre (JRC) of the European Commission, covering 1984 to 2018, has been the most widely used [16].

As mentioned above, previous studies have adopted various methods for extracting water bodies and developed multiple global water body datasets. In addition to enhancing image classification accuracy, efforts have been made to improve classification efficiency. With the emergence of cloud-based platforms such as GEE, it has become possible to monitor water bodies on a large scale. Therefore, this study proposes a hybrid approach for surface water extraction that combines online and offline methods and applies it to Anhui Province, China. The specific objectives were (1) to propose a framework for surface water extraction that combines water indices, the Bias-Corrected Fuzzy Clustering Method (BCFCM) algorithm, and OTSU threshold segmentation to improve the accuracy of water body extraction; (2) to analyze the spatiotemporal changes and inundation frequency of surface water in Anhui over 38 years using all available Landsat-5/7/8 imagery; and (3) to compare the advantages and limitations of the results of this study with those of the JRC dataset. Although the results obtained in this study pertain to a specific region, the extraction method can provide an important basis for the investigation and management of regional water resources, flood disaster monitoring, and agricultural irrigation in other regions.

2. Study area and datasets

2.1. Study area

Anhui Province is located in eastern China, between 29°41′–34°38′N and 114°54′–119°37′E. The total area of the province is 140,100 km². It is situated in the lower reaches of the Yangtze River and the middle reaches of the Huai River, and its terrain changes from plains to hills and mountains from north to south. The province lies in a transitional zone between the warm temperate and subtropical climates, characterized by a distinct monsoon, with an average annual temperature of 14–17 °C. Hydrologically, the province spans three basins, those of the Yangtze River, Huai River, and Xin'an River (Fig. 1a–c), with more than 3000 smaller rivers and 580 lakes. Chao Lake is the largest lake in Anhui and the fifth-largest freshwater lake in China. The average annual precipitation in Anhui is 800–1800 mm, and the total amount of water resources is about 800 × 10⁸ m³. The annual runoff varies greatly between years and is unevenly distributed within the year. Rainfall gradually increases from northwest to southeast and is also uneven from north to south, frequently leading to floods and droughts during the monsoon season.

Fig. 1.

Location of the Anhui Province. (a) The general position of the study area in China. (b) The red-shaded area indicates the extent of Anhui Province. (c) Cities and general topography of Anhui Province.

2.2. Datasets

To obtain a long time series of satellite images, we selected the Landsat Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+), and Operational Land Imager (OLI) as the primary data sources, covering the period from 1984 to 2021 at a spatial resolution of 30 m. Note that the ETM+ sensor malfunctioned after 2003, so ETM+ imagery was mainly used as auxiliary data when the number of images was insufficient. In addition, because the available Landsat images were limited in certain years (such as 2012), we used HJ-1A/B Charge Coupled Device (CCD) images with the same 30 m resolution as a supplement. These images were obtained from the China Resource Satellite Application Center (https://data.cresda.cn/). Detailed parameters of the HJ-1 CCD are provided in Table S1. To assess the accuracy of water body extraction, Sentinel-2 Multispectral Instrument (MSI) images with a spatial resolution of 10 m and high-resolution Google Earth images were used to validate the results (Table 1). The HJ-1 CCD and Sentinel-2 MSI product levels used in this study are L1C and L2A, respectively. To reduce potential errors caused by atmospheric conditions and cloud cover, this study selected surface reflectance images with a cloud cover of less than 10 %. In total, 2366 Landsat TM/ETM+/OLI images and 4096 Sentinel-2 MSI images were retrieved through the GEE platform. Fig. 2a and b show the number of Landsat images per path and row, as well as the number and average cloud cover of images in each year. Because of the large number of images covering the study area, correcting and mosaicking each scene separately in the ENVI software would have been a heavy workload. Therefore, we used the surface reflectance products of Landsat images in GEE [43], applied median compositing to generate cloud-free composite images, and then calculated several commonly used water indices [44,45].
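
As an illustration of the compositing step, the following minimal Python sketch takes the per-pixel median of the cloud-free observations in a small time stack. It is not the GEE workflow used in the study; the function name `median_composite`, the nested-list image layout, and the sample reflectance values are assumptions made for the example.

```python
from statistics import median


def median_composite(stack, cloud_masks):
    """Per-pixel median of the cloud-free observations in a time stack.

    stack:       list of 2-D lists, one reflectance grid per acquisition date
    cloud_masks: same shape as stack, True where the pixel is cloudy
    Returns a 2-D list; a pixel with no clear observation becomes None.
    """
    rows, cols = len(stack[0]), len(stack[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Keep only the dates on which this pixel is not flagged as cloud.
            clear = [img[r][c] for img, mask in zip(stack, cloud_masks)
                     if not mask[r][c]]
            out_row.append(median(clear) if clear else None)
        composite.append(out_row)
    return composite


# Three hypothetical acquisitions of a 1 x 2 scene; 0.90 is a cloudy pixel.
stack = [[[0.10, 0.30]], [[0.90, 0.32]], [[0.12, 0.34]]]
masks = [[[False, False]], [[True, False]], [[False, False]]]
composite = median_composite(stack, masks)
```

Because the median is taken over clear observations only, a single bright cloud value does not shift the composite, which is the property that makes median composites useful for annual mosaics.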

Table 1.

Satellite imagery used in the study.

| Satellite image | Period | Spatial resolution | Temporal resolution | Data source |
|---|---|---|---|---|
| Landsat-1∼3 MSS | 1984–2000 | 79 m | 18 d | Google Earth Engine (GEE), https://earthengine.google.com/ |
| Landsat-5 TM | 1999–2011 | 30 m | 18 d | GEE |
| Landsat-7 ETM+ | 1999–2013 | 15 + 30 m | 16 d | GEE |
| Landsat-8 OLI | 2013–2021 | 15 + 30 m | 16 d | GEE |
| Sentinel-2 MSI | 2015–2021 | 10 + 20 + 60 m | 5 d | GEE |
| HJ-1A/B CCD | 2008–2021 | 30 m | 2 d | China Centre for Resources Satellite Data and Application, https://data.cresda.cn/ |

Fig. 2.

Spatial distribution of total number and average annual cloud cover of Landsat MSS/TM/ETM+/OLI images in Anhui Province from 1984 to 2021. (a) Number of Landsat images per row and path; (b) Number and average cloud cover of Landsat images in each year.

For the limited number of HJ-1 CCD images, a positional deviation of about 3–4 pixels relative to the Landsat images was observed visually (Fig. S1). Therefore, we selected several Ground Control Points (GCPs) and geometrically corrected the HJ-1 CCD images using Landsat images as a reference. After correction, the geometric error of the HJ-1 CCD images was within one pixel. All images were then projected to Universal Transverse Mercator (UTM) Zone 50 North with the WGS-84 datum. In addition, the global annual surface water dataset from 1984 to 2021 developed by the Joint Research Centre (JRC) was used to validate the extraction results.

3. Methodology

3.1. Extraction of surface water bodies

The rapid and accurate extraction of water bodies is the primary requirement for analyzing their spatiotemporal distribution. Water indices are the most commonly used method for water body extraction. These indices are mathematical models constructed from band operations, such as ratios and linear combinations of different spectral bands. They enhance the contrast between water bodies and other land features, enabling the identification of water bodies (Fig. 3a–c). The formulas of several commonly used water indices are given in Eqs. (1)–(6). Note that AWEI has two forms: AWEIsh is primarily intended to remove shadow pixels, whereas AWEInsh is designed for areas with an urban background.

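$$\mathrm{NDWI} = \frac{\rho_{Green} - \rho_{NIR}}{\rho_{Green} + \rho_{NIR}} \tag{1}$$
$$\mathrm{MNDWI} = \frac{\rho_{Green} - \rho_{SWIR1}}{\rho_{Green} + \rho_{SWIR1}} \tag{2}$$
$$\mathrm{MBWI} = 2 \times \rho_{Green} - \rho_{Red} - \rho_{NIR} - \rho_{SWIR} \tag{3}$$
$$\mathrm{AWEI_{nsh}} = 4 \times (\rho_{Green} - \rho_{SWIR1}) - (0.25 \times \rho_{NIR} + 2.75 \times \rho_{SWIR2}) \tag{4}$$
$$\mathrm{AWEI_{sh}} = \rho_{Blue} + 2.5 \times \rho_{Green} - 1.5 \times (\rho_{NIR} + \rho_{SWIR1}) - 0.25 \times \rho_{SWIR2} \tag{5}$$
$$\mathrm{WI_{2015}} = 1.7204 + 171 \times \rho_{Green} + 3 \times \rho_{Red} - 70 \times \rho_{NIR} - 45 \times \rho_{SWIR1} - 71 \times \rho_{SWIR2} \tag{6}$$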

Where ρBlue, ρGreen, ρRed, ρNIR, ρSWIR1, ρSWIR2 are the surface reflectance of Landsat in the blue, green, red, near-infrared, and two short-wave infrared bands, respectively. Fig. 4a–f displays the calculation results of several water body indices, indicating that the above indices effectively highlight most of the water body information with relatively clear boundaries. Among them, the enhancement effects of some rivers and lakes (red and yellow ellipses) in the NDWI, MNDWI, and MBWI images were poor, while the enhancement effects of AWEIsh and WI2015 were better.
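
To make the index definitions concrete, the sketch below evaluates NDWI, MNDWI, AWEInsh, AWEIsh, and WI2015 for a single pixel, following Eqs. (1), (2), (4), (5), and (6); MBWI is left out for brevity. The function names and the two sets of reflectance values (one water-like, one vegetation-like) are hypothetical and chosen only for illustration.

```python
def ndwi(green, nir):
    return (green - nir) / (green + nir)


def mndwi(green, swir1):
    return (green - swir1) / (green + swir1)


def awei_nsh(green, nir, swir1, swir2):
    return 4 * (green - swir1) - (0.25 * nir + 2.75 * swir2)


def awei_sh(blue, green, nir, swir1, swir2):
    return blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2


def wi2015(green, red, nir, swir1, swir2):
    return 1.7204 + 171 * green + 3 * red - 70 * nir - 45 * swir1 - 71 * swir2


def all_indices(p):
    """Evaluate the five indices for a dict of band reflectances (0-1 scale)."""
    return {
        "NDWI": ndwi(p["green"], p["nir"]),
        "MNDWI": mndwi(p["green"], p["swir1"]),
        "AWEInsh": awei_nsh(p["green"], p["nir"], p["swir1"], p["swir2"]),
        "AWEIsh": awei_sh(p["blue"], p["green"], p["nir"], p["swir1"], p["swir2"]),
        "WI2015": wi2015(p["green"], p["red"], p["nir"], p["swir1"], p["swir2"]),
    }


# Hypothetical surface reflectances, not taken from the study.
water = {"blue": 0.06, "green": 0.08, "red": 0.05,
         "nir": 0.03, "swir1": 0.01, "swir2": 0.01}
vegetation = {"blue": 0.03, "green": 0.06, "red": 0.04,
              "nir": 0.40, "swir1": 0.20, "swir2": 0.10}

water_values = all_indices(water)
vegetation_values = all_indices(vegetation)
```

For these sample pixels every index is positive over water and negative over vegetation, which is the contrast that the later clustering and thresholding steps rely on.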

Fig. 3.

Flowchart of the water extraction method in this study.

Fig. 4.

Comparison of several commonly used water indices in the Yangtze River basin.

3.2. Fuzzy C-means clustering and threshold segmentation

Calculating the water indices enhances water bodies and suppresses the background. For large and medium-sized lakes with clear boundaries and for wide rivers, threshold segmentation and automatic extraction are relatively easy. For complex urban river networks, small lakes with mixed edge pixels, and rivers in mountainous terrain, however, direct threshold segmentation inevitably produces errors. Therefore, building on the fuzzy clustering method [46,47], this study employed the BCFCM method to suppress background noise around water bodies (Fig. 5a and b) and further increase the brightness values of water bodies [2,32]. The input to the BCFCM algorithm is the water index image in TIFF format, and the parameters are the two peak values from the histogram of the water index and the number of iterations to execute. After BCFCM, the brightness values of the non-water background were suppressed, and the pixels with high brightness values were mostly water bodies (Fig. 5c and d). The image histograms before and after processing show that the overlap between the water and land distributions is reduced, which reduces the confusion between water and land pixels.
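
The clustering idea can be sketched with standard fuzzy c-means on scalar index values. This is a simplified stand-in, not the BCFCM used in the study: the bias-correction term is omitted, and the function name `fuzzy_c_means_1d`, the fuzzifier m = 2, and the sample values are assumptions. As in the text, the initial cluster centres play the role of the two histogram peaks, and the number of iterations is a parameter.

```python
def fuzzy_c_means_1d(values, centers, iterations=20, m=2.0):
    """Standard fuzzy c-means on scalar values (no bias correction).

    values:  list of water-index values
    centers: initial cluster centres, e.g. the two histogram peaks
    Returns (centers, memberships); memberships[k][j] is the degree to which
    values[k] belongs to cluster j, as computed in the final iteration.
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for v in values:
            dists = [abs(v - c) for c in centers]
            if 0.0 in dists:
                # The value coincides with a centre: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if j == hit else 0.0 for j in range(len(centers))]
            else:
                row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
                       for d_j in dists]
            memberships.append(row)
        # Move each centre to the membership-weighted mean of the values.
        for j in range(len(centers)):
            weights = [row[j] ** m for row in memberships]
            centers[j] = (sum(w * v for w, v in zip(weights, values))
                          / sum(weights))
    return centers, memberships


# Hypothetical index values: three land-like and three water-like pixels.
values = [-0.80, -0.75, -0.70, 0.20, 0.25, 0.30]
centers, memberships = fuzzy_c_means_1d(values, centers=[-0.5, 0.1])
water_membership = [row[1] for row in memberships]
```

Using the membership in the water cluster as the new pixel value pushes background pixels toward 0 and water pixels toward 1, which is one way to picture the background homogenization described above.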

Fig. 5.

Background homogenization images and histogram statistics. (a) AWEIsh images; (b) Histogram of the AWEIsh image; (c) BCFCM image; (d) Histogram of the BCFCM image.

The method proposed by Otsu is commonly used to determine the optimal segmentation threshold. It determines the threshold automatically and generates a binary image, which gives the preliminary surface water bodies. Some narrow rivers, especially in upstream areas, may appear fragmented in this binary image. To address this issue, mathematical morphology operators, which were originally developed for and are widely used in digital image processing and analysis [48], were applied so that disconnected river segments could be reconnected. Finally, through visual interpretation, some missing small rivers were added and erroneous surface water was removed to obtain the complete surface water results.
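
A minimal sketch of Otsu thresholding is given below: the values are binned into a histogram, and the split that maximizes the between-class variance is returned. The function name `otsu_threshold`, the choice of 256 bins, and the sample values are assumptions; the morphological reconnection step is not shown.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold that maximises the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        hist[idx] += 1
    total = len(values)
    # Bin centres stand in for the grey levels.
    levels = [lo + (i + 0.5) * width for i in range(bins)]
    sum_all = sum(h * g for h, g in zip(hist, levels))
    best_var, best_t = -1.0, lo
    w0 = 0       # number of samples in the lower class
    sum0 = 0.0   # sum of levels in the lower class
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += hist[i] * levels[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        # Proportional to the between-class variance (the constant 1/total^2
        # does not change where the maximum occurs).
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, lo + (i + 1) * width
    return best_t


# Hypothetical enhanced-image values: five background and four water pixels.
values = [0.05, 0.07, 0.08, 0.10, 0.12, 0.85, 0.88, 0.90, 0.95]
threshold = otsu_threshold(values)
water_mask = [v > threshold for v in values]
```

The resulting mask marks the four high-valued pixels as water; on a real image the same threshold would be applied to every pixel of the enhanced index image.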

3.3. Accuracy evaluation

Accuracy evaluation is an important means of verifying the reliability of image extraction. It typically compares the classification results with higher-resolution reference data and builds a confusion matrix to calculate the accuracy. In this study, Sentinel-2 MSI images were the primary reference for validating the accuracy of surface water extraction: a total of 692 MSI images of Anhui acquired in 2020 with cloud cover below 5 % were used. Higher-resolution Google Earth images were also used to identify areas that were difficult to classify, such as those with shadows or along water edges. After visually interpreting the MSI images, 4276 samples were identified as ground reference, including 2022 water samples and 2254 non-water samples. The water samples primarily included lakes, rivers, reservoirs, and urban rivers, while the non-water samples mainly consisted of vegetation, buildings, bare land, and shadow areas caused by mountains. The extraction results of each water index were validated individually by establishing a confusion matrix and calculating the overall classification accuracy, kappa coefficient, commission (misclassification) rate, and omission rate.
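
The accuracy measures named above can be computed from a 2 × 2 water/non-water confusion matrix as in the following sketch. The function name `accuracy_metrics` and the sample counts are hypothetical; they are not the counts behind the accuracies reported in this study.

```python
def accuracy_metrics(tp, fn, fp, tn):
    """Accuracy measures for a 2 x 2 water / non-water confusion matrix.

    tp: reference water mapped as water
    fn: reference water mapped as non-water (omission)
    fp: reference non-water mapped as water (commission)
    tn: reference non-water mapped as non-water
    """
    n = tp + fn + fp + tn
    producers = tp / (tp + fn)   # producer's accuracy for the water class
    users = tp / (tp + fp)       # user's accuracy for the water class
    overall = (tp + tn) / n
    # Chance agreement expected from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    kappa = (overall - expected) / (1 - expected)
    return {
        "producers_accuracy": producers,
        "users_accuracy": users,
        "omission": 1 - producers,
        "commission": 1 - users,
        "overall_accuracy": overall,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only.
metrics = accuracy_metrics(tp=90, fn=10, fp=5, tn=95)
```

With these counts the overall accuracy is 0.925 and the kappa coefficient is 0.85; omission and commission are simply the complements of the producer's and user's accuracies.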

It is worth noting that in terms of sample type identification, the method proposed by Dehkordi et al. for automatically generating water/non-water samples improves the efficiency of sample classification [49]. In contrast, this study conducted manual visual interpretation based on Sentinel-2 and Google Earth images, resulting in more reliable results.

3.4. Frequency of surface water inundation

Using the above procedure, every pixel in every image is classified as either water or non-water. Because the extent of water bodies varies significantly between dry and wet seasons, the Water Inundation Frequency (WIF) index was introduced to reflect the spatial distribution of surface water within and between years [50,51]. It gives the frequency of water occurrence for each pixel, from which a frequency distribution map of water bodies in Anhui is generated. Based on the seasonal behavior of water bodies, surface water with a frequency greater than 75 % is defined as permanent surface water, surface water with a frequency between 5 % and 75 % as seasonal surface water, and surface water with a frequency of less than 5 % as temporary surface water [52]. WIF is calculated as:

$$\mathrm{WIF} = \frac{\sum_{i=1}^{N} w}{N} \times 100\% \tag{7}$$

Where WIF is the water inundation frequency of a pixel, ranging from 0 to 100 %; N is the total number of reliable observations within a specific period; and the sum of w over the N observations is the number of observations in which the pixel was identified as water.
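
The following sketch computes WIF for a single pixel from its time series of water/non-water labels and assigns the frequency class. The function names are hypothetical. The text gives the permanent class as "greater than 75 %" here and as WIF ≥ 75 % in Section 4.3; the sketch assumes the latter, and it also assumes that unreliable observations (for example, cloudy dates) are excluded from N.

```python
def water_inundation_frequency(observations):
    """WIF (%) for one pixel.

    observations: per-date labels, 1 = water, 0 = non-water,
                  None = no reliable observation (e.g. cloud).
    Returns None when the pixel has no reliable observation.
    """
    valid = [o for o in observations if o is not None]
    if not valid:
        return None
    return 100.0 * sum(valid) / len(valid)


def classify_wif(wif):
    """Frequency class; the handling of exactly 75 % is an assumption."""
    if wif is None or wif == 0:
        return "non-water"
    if wif >= 75:
        return "permanent"
    if wif >= 5:
        return "seasonal"
    return "temporary"


# Hypothetical label series for three pixels.
lake_pixel = [1, 1, 1, None, 1, 0, 1, 1, 1]   # 7 of 8 valid dates are water
floodplain_pixel = [1, 0, 0, 1]               # 2 of 4
rarely_wet_pixel = [0] * 39 + [1]             # 1 of 40

wif_values = [water_inundation_frequency(p)
              for p in (lake_pixel, floodplain_pixel, rarely_wet_pixel)]
classes = [classify_wif(w) for w in wif_values]
```

Applied to every pixel of the study area, the same two functions would yield the frequency map and the permanent, seasonal, and temporary classes.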

4. Results

4.1. Comparison of water extraction results

To compare the water extraction performance of each water index, this study selected three typical regions in Anhui, namely the Huai River, Chao Lake, and the Yangtze River basin, located in the northern, central, and southern parts of Anhui, respectively. The Huai River is the longest river in northern Anhui and serves as the boundary between northern and southern China. Chao Lake is the largest lake in Anhui Province and the fifth-largest freshwater lake in China. The Yangtze River is the longest river in China.

According to the visual interpretation results, the extraction performance of MBWI, NDWI, and MNDWI was relatively poor, so these three water indices were compared first. As can be seen from Fig. 6, all three indices completely extracted large lakes with extensive water areas. However, for thin linear rivers in the southwestern part of the Chao Lake basin and around small lakes, MBWI extracted the least water, while MNDWI extracted the most (Fig. 6a–c, red, yellow, and purple ellipses). For the Yangtze River basin in southern Anhui, the extraction results of the different water indices varied widely, with incomplete extraction of rivers and medium-sized lakes (Fig. 6d–f, red, purple, and yellow rectangles). MBWI extracted the least water, with significant omissions in some sections of the Yangtze River along the lake margins, while MNDWI was more accurate than MBWI and NDWI and identified surface water more completely. The omissions may be caused by suspended materials such as sediment entrained in the river, water turbidity, and otherwise inconspicuous spectral characteristics.

Fig. 6.

Comparison of surface water extraction results for MBWI, NDWI, and MNDWI combined with BCFCM. (a–c) Chao Lake basin; (d–f) Yangtze River basin.

Next, the three indices AWEIsh, AWEInsh, and WI2015 were compared. As shown in Fig. 7, the extraction results of these three water indices were generally superior to those of MBWI, NDWI, and MNDWI. Most of the extracted water bodies closely matched the actual water bodies, with extraction accuracy generally exceeding 90 %. Among the three water indices, AWEInsh had the highest omission and error rates for water bodies (yellow areas), while AWEIsh and WI2015 both produced better extraction results, especially at the edges of lakes. In other words, AWEIsh and WI2015 identified most water bodies and recognized the boundaries of smaller water bodies more reliably. Note that in Region 2 of the Yangtze River, the yellow area in Fig. 7 corresponds to Shengjin Lake, a national nature reserve in China surrounded by lush vegetation and numerous small water bodies mixed with waterweed, which resulted in some omissions in the extraction of water bodies.

Fig. 7.

Comparison of lakes and river networks with AWEIsh, AWEInsh, and WI2015 combined with BCFCM, where blue represents the water bodies and yellow indicates the misclassified water bodies. (a) The Yangtze River basin and its two local magnified areas; (b) Chao Lake and its two local magnified areas; (c) Huai River and its two local magnified areas.

To validate the accuracy of water extraction, this study first applied BCFCM and OTSU threshold segmentation to each of the six water indices mentioned above to obtain six surface water distribution maps. Second, classification samples were selected from the six surface water results, and validation samples were chosen based on Sentinel-2 MSI and Google Earth imagery. Finally, the accuracy of water extraction was quantitatively evaluated using error matrices and kappa coefficients (Table 2). NDWI had the lowest overall accuracy and kappa coefficient, followed by AWEInsh and MBWI, and all three indices had high omission errors. The extraction results of MNDWI were better than those of these three indices, with an overall accuracy of 90.26 %; most of the mis-extracted water bodies were small and medium-sized rivers and small lakes. The extraction results of AWEIsh and WI2015 were relatively complete, with both indices achieving an overall accuracy of over 92 %, and WI2015 had the highest overall accuracy and kappa coefficient. Combining these results with the visual interpretation, the study selected WI2015 for surface water extraction.

Table 2.

Accuracy evaluation of surface water extraction results combining water indices and BCFCM.

| Index | Prod. Acc (%) | User. Acc (%) | Commission (%) | Omission (%) | Overall accuracy (%) | Kappa |
|---|---|---|---|---|---|---|
| MBWI + BCFCM | 75.96 | 98.64 | 1.36 | 24.04 | 86.08 | 0.72 |
| NDWI + BCFCM | 60.60 | 98.07 | 1.93 | 39.40 | 77.49 | 0.56 |
| MNDWI + BCFCM | 86.33 | 91.35 | 8.65 | 13.67 | 90.26 | 0.80 |
| AWEInsh + BCFCM | 69.62 | 99.51 | 0.49 | 30.38 | 82.96 | 0.67 |
| AWEIsh + BCFCM | 92.27 | 94.91 | 5.09 | 7.73 | 92.97 | 0.86 |
| WI2015 + BCFCM | 91.19 | 97.97 | 2.03 | 8.81 | 94.06 | 0.88 |

4.2. Annual analysis of surface water bodies

Based on the above water body extraction methods, the spatial distribution of surface water in Anhui Province was obtained from 1984 to 2021 (Fig. 8a–i). The results indicated that surface water in Anhui was most abundant in 1998 and least abundant in 2001 (Fig. S2), with more water in the south than in the north. In addition, we converted the output binary raster data into vector layers, counted the lakes and rivers, and classified the lakes by area. For rivers, we converted river polygons into lines and then calculated river length and density. The area and number of lakes and the length and density of rivers in each city of the province in 2021 were then analyzed statistically (Fig. 9a and b). Based on the lakes present in the study area, we defined large, medium, and small lakes as those larger than 100 km², between 10 and 100 km², and between 1 and 10 km², respectively; lakes smaller than 1 km² were not counted. The results indicate that surface water in Anhui shows obvious spatial heterogeneity, with more lakes in the south and fewer in the north. In terms of surface water types, northern Anhui is dominated by rivers, with only 18 lakes larger than 10 km². The river density in northern Anhui is 0.2 km/km², with the largest river network in Fuyang, followed by Bozhou and Suzhou. Besides Chao Lake, which is the largest lake in the province with an area of 766 km², the surface water in central Anhui is dominated by lakes, with 209 lakes larger than 1 km². In terms of the number of lakes, Chuzhou City has the most, with 75 lakes larger than 1 km². Anqing City has four large lakes and eleven medium-sized lakes, along with numerous rivers of varying widths. As for river length, Hefei City has the longest total river length, reaching 1900 km, followed by Liu'an and Anqing, with lengths of 1663 km and 1393 km, respectively. The average river density in central Anhui is 0.09 km/km². Southern Anhui contains a large number of medium to large lakes, so its surface water area is also large. According to the statistical analysis, Anhui has 385 small lakes and 69 medium-sized lakes, of which 75 and 19, respectively, are in the southern region. In terms of river network density, the prefecture-level cities of Ma'anshan and Wuhu have the highest values, reaching 0.42 km/km² and 0.31 km/km², respectively. Regarding the surface water area of each city, Hefei has the largest, largely due to the presence of Chao Lake, followed by Anqing with a surface water area of 1160 km². The three cities with the smallest surface water areas are all located in northern Anhui, namely Huaibei, Bozhou, and Suzhou.
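
The lake size classes and the river density used in this section reduce to simple rules, sketched below. The function names are hypothetical, the treatment of lakes of exactly 10 km² or 100 km² is an assumption because the text does not specify it, and the 10,000 km² region area in the usage example is a made-up value rather than the area of any city in the study.

```python
def lake_size_class(area_km2):
    """Size class of a lake; lakes smaller than 1 km2 are not counted."""
    if area_km2 > 100:
        return "large"
    if area_km2 >= 10:
        return "medium"
    if area_km2 >= 1:
        return "small"
    return None


def river_density(total_length_km, region_area_km2):
    """River network density in km of river per km2 of land."""
    return total_length_km / region_area_km2


# Chao Lake (766 km2, from the text) falls in the large class.
chao_lake_class = lake_size_class(766)

# 1900 km of rivers in a hypothetical 10,000 km2 region.
example_density = river_density(1900, 10_000)
```

The density in the usage example evaluates to 0.19 km/km², the same order of magnitude as the regional values quoted above.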

Fig. 8.

Spatial distribution of surface water bodies in Anhui Province from 1984 to 2021.

Fig. 9.

Statistics on the number of lakes and river network density in each city of Anhui Province in 2021. (a) Area and number of lakes, (b) Length and density of rivers.

4.3. Analysis of surface water inundation frequency

Fig. 10 displays the inundation frequency of surface water in Anhui Province from 1984 to 2021, where the inundation frequency of each pixel varies greatly over time. Permanent surface water (WIF ≥75 %) mostly occurred in large rivers, lakes, and reservoirs such as Chao Lake and the Yangtze and Huai Rivers, as well as in medium and large-sized lakes, covering an area of 4341 km² and accounting for 32.03 % of the total water area. Seasonal surface water (5 % ≤ WIF <75 %) was primarily distributed around the small and medium-sized lakes, as well as on both sides of large and medium-sized rivers, with an area of 6661 km², accounting for 49.15 % of the total water area. The temporary surface water area (WIF <5 %) was relatively small, covering only 2551 km², and was mainly concentrated in the areas covered by water over several years.
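
The area shares can be checked with a few lines of arithmetic. The sketch assumes that the total water area is the sum of the three classes (4341 + 6661 + 2551 = 13,553 km²), which the text does not state explicitly; the variable names are hypothetical.

```python
# Class areas in km2, as reported in the text.
areas_km2 = {"permanent": 4341, "seasonal": 6661, "temporary": 2551}

# Assumed total: the three classes together make up all mapped water.
total_km2 = sum(areas_km2.values())

# Share of each class in percent, rounded to two decimals.
shares = {name: round(100 * area / total_km2, 2)
          for name, area in areas_km2.items()}
```

Under this assumption the shares are 32.03 % permanent, 49.15 % seasonal, and 18.82 % temporary, which matches the permanent and seasonal percentages given in the abstract.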

Fig. 10.

Frequency distribution of surface water in Anhui Province using Landsat images from 1984 to 2021. (b) The Huai River Basin, (c) the Chao Lake, and (d) the Yangtze Basin.

5. Discussion

5.1. Comparison of our results to the JRC dataset

Numerous global and regional surface water datasets have been constructed by domestic and international scholars. Among them, the global yearly history dataset (v1.4) released by the Joint Research Centre (JRC) has been widely used. This dataset is based on Landsat 5, 7, and 8 images, classifies each pixel as water or non-water, and comprises a total of 454 images from March 1984 to December 2020. We selected several typical regions to illustrate the differences between the JRC global surface water dataset and our results (Fig. 11). Compared with the JRC dataset, this study provides a more complete extraction of small and medium-sized lakes (Fig. 11a–d) and a more accurate identification of small water bodies (Fig. 11e and f). However, in regions where the spectral difference between the image background and surface water is minimal, such as aquaculture areas, the surface water extent extracted in this study is slightly smaller (Fig. 11g and h). Overall, the results of this study can effectively supplement the JRC water dataset and can support water resource investigation and management, as well as flood forecasting and warning.
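One way to quantify such a comparison is a pixel-wise tally of where two binary water masks agree and disagree. The sketch below is illustrative only: the function names, the toy 3 × 3 grids, and the intersection-over-union summary are assumptions, not the procedure or data used in this study.

```python
def compare_water_masks(ours, reference):
    """Tally agreement between two binary masks (1 = water, 0 = non-water)."""
    if len(ours) != len(reference):
        raise ValueError("masks must have the same number of rows")
    counts = {"both_water": 0, "only_ours": 0, "only_reference": 0, "both_non_water": 0}
    for row_ours, row_ref in zip(ours, reference):
        if len(row_ours) != len(row_ref):
            raise ValueError("masks must have the same number of columns")
        for a, b in zip(row_ours, row_ref):
            if a and b:
                counts["both_water"] += 1
            elif a:
                counts["only_ours"] += 1       # water here, non-water in reference
            elif b:
                counts["only_reference"] += 1  # water in reference, missed here
            else:
                counts["both_non_water"] += 1
    return counts


def intersection_over_union(counts):
    """Share of pixels mapped as water by either product that both agree on."""
    union = counts["both_water"] + counts["only_ours"] + counts["only_reference"]
    return counts["both_water"] / union if union else 1.0


# Toy grids standing in for two co-registered water masks (hypothetical values).
ours = [[1, 1, 0], [1, 0, 0], [0, 0, 1]]
reference = [[1, 0, 0], [1, 0, 1], [0, 0, 1]]
counts = compare_water_masks(ours, reference)
print(counts, intersection_over_union(counts))
```

In such a tally, a large "only_ours" count over small lakes and a large "only_reference" count over aquaculture ponds would correspond to the two patterns described above.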

Fig. 11. Comparison of our results with the JRC dataset.

5.2. Limitations and further research

Although this study used a time series of Landsat images to obtain the distribution of surface water in Anhui Province from 1984 to 2021, some limitations remain: (1) the primary data are Landsat images with a resolution of 30 m, and because of this limited image quality, some small water bodies, especially in mountainous areas and urban river networks, cannot be effectively extracted; (2) the proposed method can effectively extract surface water of different scales, such as rivers, lakes, and reservoirs, but it does not account for the influence of terrain on water body extraction: because the spectral brightness values of water bodies and mountain shadows are both generally low, some mountain shadows were misclassified as water bodies, which reduced the extraction accuracy; and (3) the reliance on visual interpretation to determine water/non-water sample types, although it enhances reliability, may introduce some subjectivity.

Therefore, in future work we aim, on the one hand, to expand the available data sources and use higher-resolution images, such as 10 m resolution Sentinel-1 SAR and Sentinel-2 MSI images, to achieve more accurate surface water extraction. On the other hand, we will consider using Digital Elevation Model (DEM) data and GIS technology to calculate mountain shadows. Removing the mountain shadow areas can reduce the misclassification rate in water body extraction, and GIS hydrological analysis tools can also be used to extract small rivers in mountainous areas, supplementing the extraction results for small water bodies in those regions. Furthermore, future research could focus on automating the sample generation process to minimize human intervention and potential bias.
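As a rough illustration of how a DEM-derived illumination value could be used to screen out shadowed pixels, the sketch below computes a hillshade-style value for a single cell from its terrain gradients and the sun position. The function names, the 0.2 illumination cut-off, and the gradient inputs are hypothetical; the sketch covers only slopes facing away from the sun (self-shadow) and does not model shadows cast by neighbouring terrain.

```python
import math


def hillshade(dz_dx, dz_dy, sun_azimuth_deg, sun_elevation_deg):
    """Illumination in [0, 1] for one cell.

    dz_dx is the elevation gradient towards the east and dz_dy towards the
    north (metres per metre). The azimuth is clockwise from north.
    """
    az = math.radians(sun_azimuth_deg)
    el = math.radians(sun_elevation_deg)
    # Unit vector pointing from the ground towards the sun (east, north, up).
    sun = (math.sin(az) * math.cos(el), math.cos(az) * math.cos(el), math.sin(el))
    # Unit normal of the local terrain surface.
    norm = math.sqrt(dz_dx ** 2 + dz_dy ** 2 + 1.0)
    normal = (-dz_dx / norm, -dz_dy / norm, 1.0 / norm)
    # Cosine of the angle between the normal and the sun, clamped at 0.
    return max(0.0, sum(n * s for n, s in zip(normal, sun)))


def keep_water_pixel(is_water, illumination, min_illumination=0.2):
    """Keep a water pixel only if the cell is reasonably well lit.

    The 0.2 cut-off is an arbitrary placeholder, not a calibrated value.
    """
    return bool(is_water) and illumination >= min_illumination


flat = hillshade(0.0, 0.0, 135.0, 30.0)             # flat terrain
facing_away = hillshade(1.0, 0.0, 90.0, 30.0)       # 45° slope facing west, sun in the east
print(flat, facing_away, keep_water_pixel(True, facing_away))
```

A dark pixel on the slope facing away from the sun is dropped from the water mask here, whereas an equally dark pixel on flat, well-lit terrain is kept.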

6. Conclusions

Quantitative estimation of surface water is essential for the sustainable utilization of regional agriculture and water resources. Time-series Landsat imagery and the GEE platform make it possible to estimate the spatial extent of surface water accurately. In this study, a combination of several water indices, the bias-corrected fuzzy C-means clustering algorithm, and the OTSU threshold segmentation method was employed to obtain the spatial distribution of annual surface water in Anhui from 1984 to 2021. By integrating operations such as background noise suppression and normalization, the method mitigated the limitations of single water index methods in complex background environments, especially in the identification of small water bodies and the completeness of lake extraction.
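For readers unfamiliar with OTSU thresholding, the following sketch shows the standard idea: choose the histogram split that maximizes the between-class variance of the two resulting groups. It is a generic, pure-Python illustration on made-up water-index values, not the GEE implementation used in this study; `otsu_threshold` and the bin count are assumptions.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # all values identical, nothing to split
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_idx, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]            # pixel count at or below bin i
        sum0 += i * hist[i]
        w1 = total - w0          # pixel count above bin i
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_idx = between, i
    # Upper edge of the best bin separates the two classes.
    return lo + (best_idx + 1) * width


# Hypothetical water-index values: a land cluster and a water cluster.
index_values = [-0.4] * 50 + [-0.3] * 50 + [0.5] * 30 + [0.6] * 30
threshold = otsu_threshold(index_values)
water_pixels = [v for v in index_values if v >= threshold]
print(threshold, len(water_pixels))
```

On this toy input the threshold falls between the two clusters, so the 60 high-index values are labelled as water.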

The regions with the minimum and maximum surface water were identified, and the dynamic changes of surface water in the province were analyzed. The results indicate that: (1) the surface water area was largest in 1998 and smallest in 2001, with more surface water in the south than in the north overall. (2) Permanent surface water is mainly formed by large rivers, lakes, and reservoirs, such as Chao Lake and the Yangtze and Huai Rivers, with a total area of approximately 4341 km2, accounting for 32.03 % of the total water area. Seasonal surface water is mainly distributed along the edges of lakes and along small and medium-sized rivers, with an area of 6661 km2, accounting for 49.15 % of the total water area. (3) Sentinel-2 MSI and Google imagery were used for accuracy verification; the overall accuracy of surface water extraction combining WI2015 and the BCFCM method was 94.06 %, and the Kappa coefficient was 0.88. Compared with the JRC global surface water dataset, the extraction of small to medium-sized lakes and small rivers is more complete in the present study.
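The overall accuracy and Kappa coefficient follow directly from a water/non-water confusion matrix. The sketch below shows the calculation with hypothetical counts (they are not the validation samples of this study); `accuracy_metrics` is an illustrative name.

```python
def accuracy_metrics(tp, fp, fn, tn):
    """Overall accuracy and Cohen's kappa for a binary confusion matrix.

    tp: water mapped as water        fp: non-water mapped as water
    fn: water mapped as non-water    tn: non-water mapped as non-water
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one class is present")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical validation counts, for illustration only.
overall_accuracy, kappa = accuracy_metrics(tp=45, fp=5, fn=5, tn=45)
print(f"OA = {overall_accuracy:.2%}, kappa = {kappa:.2f}")
```

Kappa is lower than the overall accuracy because it discounts the agreement that would be expected by chance alone.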

Data availability statement

Data will be made available on request.

Ethical approval

The authors have unanimously approved the submission of this paper.

CRediT authorship contribution statement

Bingxue Zhao: Writing – review & editing, Writing – original draft, Methodology, Funding acquisition, Conceptualization. Lei Wang: Validation, Investigation, Data curation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This study was supported by the Improvement Project of Anhui Province (grant number 2022xjzlts029) and the Natural Science Research Project of Chizhou University (grant number CZ2022YJRC06). The authors are grateful to the United States Geological Survey (USGS), the European Space Agency (ESA), and Google Earth Engine (GEE) for providing Landsat-8 and Sentinel-2 images, and to the China Central Resources for Satellite Data and Applications (CCRSDA) for providing HJ-1 images. Any errors or shortcomings in the paper are the responsibility of the authors.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.heliyon.2024.e36660.

Contributor Information

Bingxue Zhao, Email: zhaobingxue302@czu.edu.cn.

Lei Wang, Email: wanglei@czu.edu.cn.

Appendix A. Supplementary data

The following is the Supplementary data to this article:

Multimedia component 1
mmc1.docx (1.4MB, docx)


Articles from Heliyon are provided here courtesy of Elsevier
