Scientific Data. 2022 Mar 2;9:66. doi: 10.1038/s41597-022-01168-x

Vectorized rooftop area data for 90 cities in China

Zhixin Zhang 1,2,3, Zhen Qian 1,2,3, Teng Zhong 1,2,3, Min Chen 1,2,3,4, Kai Zhang 1,2,3, Yue Yang 1,2,3, Rui Zhu 5, Fan Zhang 6, Haoran Zhang 7,8,9, Fangzhuo Zhou 1,2,3, Jianing Yu 1,2,3, Bingyue Zhang 1,2,3, Guonian Lü 1,2,3, Jinyue Yan 9,10
PMCID: PMC8891309  PMID: 35236863

Abstract

Reliable information on building rooftops is crucial for utilizing limited urban space effectively. In recent decades, the demand for accurate and up-to-date large-scale data on rooftop areas has been increasing. However, obtaining these data is challenging due to the limited capability of conventional computer vision methods and the high cost of 3D modeling involving aerial photogrammetry. In this study, a geospatial artificial intelligence framework is presented to obtain data for rooftops using high-resolution open-access remote sensing imagery. This framework is used to generate vectorized data for rooftops in 90 cities in China. The data were validated on test samples covering 180 km² across different regions, with a spatial resolution of 1 m, an overall accuracy of 97.95%, and an F1 score of 83.11%. In addition, the generated rooftop area conforms to the urban morphological characteristics and reflects the urbanization level. These results demonstrate that the generated dataset can provide data support for decision-making that facilitates sustainable urban development.

Subject terms: Geography, Environmental sciences


Measurement(s): building rooftop area
Technology Type(s): computational modeling technique
Sample Characteristic - Environment: city
Sample Characteristic - Location: China

Background & Summary

Rooftops of buildings have been intensively studied in fields such as sustainable urban development, building energy modeling, and urban planning and design in recent decades1–3. Owing to urbanization associated with the digital age, reliable information on rooftops is in increasing demand4–6. Rapid access to accurate rooftop information is important for the evaluation of urban and rural development trends. These trends are useful for formulating development strategies and protecting urban and rural ecosystems7–9. However, data on rooftop areas are unavailable in many developing countries because of resource constraints. Therefore, methods suitable for generating reliable data on rooftop areas of buildings at low cost are urgently needed10–12.

The automatic extraction of rooftop area data is gaining popularity in diverse fields, and studies involving varied data sources exist13. Three-dimensional (3D) spatial data, such as the Digital Surface Model (DSM) and Light Detection and Ranging (LiDAR), are exploited for reconstructing buildings, which includes the rooftop area representation and geometric modeling13–16. However, acquiring 3D spatial data and constructing the associated 3D models are costly, especially at the city scale. Owing to the development of image processing algorithms, such as edge detection and image segmentation, rooftop data can be extracted from high-resolution remote sensing imagery17,18. Conventional image processing techniques, however, involve complex empirical rules and threshold settings, and thus exhibit limitations when applied to high-resolution remote sensing imagery at large scale14.

Open-access data from public service providers, such as Google Earth, Baidu Map, and OpenStreetMap, provide opportunities for the acquisition of urban information with broad coverage, fast updating speeds, and low cost19–21. However, although open-access data such as Google Earth Satellite (GES) images are valuable for obtaining information on cities, conventional processing methods can hardly extract in-depth semantic information and lack flexibility when examining large data involving complex features22.

In recent years, deep learning methods have been employed for efficient feature learning and urban information acquisition23,24. In fact, deep learning-based image semantic segmentation methods have been applied for the extraction of rooftop data25,26. Nevertheless, to optimize rooftop data extraction applications, the data acquisition process requires the incorporation of geographic information27,28. In addition, public rooftop area datasets that are suitable for use as training datasets in machine learning are scant29. Further, naive deep learning models based on unbalanced and insufficient training samples exhibit unsatisfactory performance30. Therefore, a robust high-performance rooftop extraction model remains elusive in China.

In the present study, the main objective is to extract accurate rooftop areas in China using high-resolution open-access remote sensing imagery based on a geospatial artificial intelligence (GeoAI) framework. The principal components of this framework are illustrated in Fig. 1. The following steps were employed for generating the rooftop area dataset: (1) data preparation through spatial stratified sampling involving geospatial prior knowledge and data processing pipeline to augment the representativeness and number of samples; (2) creation of a deep learning segmentation model, which is based on an ensemble learning strategy and an improved prediction method to improve the rooftop extraction performance.

Fig. 1. The framework for obtaining rooftop area data in China.

Based on the framework, we developed a national-scale vectorized rooftop area dataset involving 90 cities in China. The data were validated on test samples covering 180 km² across different regions, with a spatial resolution of 1 m, an overall accuracy of 97.95%, and an F1 score of 83.11%.

Methods

Data collection

GES imagery

In the present study, open-access GES imagery served as the primary data source (Table 1). GES imagery is advantageous because of its high resolution and wide coverage. The images were downloaded in October 2020 using Python scripts that call the open map service application programming interface (Google Earth API) provided by Google. The spatial resolution of GES imagery varies according to its image level. The spatial resolution of level-18 GES imagery is approximately 0.6 m/pixel in most developed countries, which adequately displays the geometry and structure of different rooftops. However, because the GES images originate from multiple sensors, differences in sensor performance cause the resolution to vary from region to region. For example, in China, images of major cities are usually obtained from the WorldView and QuickBird satellites, with an original resolution within 1 m. In contrast, images of remote areas are acquired from the SPOT series satellites, whose original resolution is within 5 m.
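The relation between image level and pixel size can be illustrated with a short calculation. The sketch below assumes that the GES image levels follow the standard Web Mercator tiling scheme with 256-pixel tiles, which the text does not state explicitly; under that assumption, level 18 gives roughly 0.6 m/pixel at the equator and a finer nominal pixel size at higher latitudes. The function name and the sample latitudes are illustrative only, and the nominal pixel size is not the same thing as the native resolution of the sensor.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis used by Web Mercator
TILE_SIZE_PX = 256          # assumed tile width in pixels (not stated in the text)


def ground_resolution(latitude_deg: float, zoom: int) -> float:
    """Approximate nominal ground resolution in metres per pixel."""
    equator_circumference = 2 * math.pi * EARTH_RADIUS_M
    pixels_around_equator = TILE_SIZE_PX * 2 ** zoom
    return equator_circumference * math.cos(math.radians(latitude_deg)) / pixels_around_equator


if __name__ == "__main__":
    # Illustrative latitudes: equator, roughly Hong Kong, roughly Beijing
    for lat in (0.0, 22.3, 39.9):
        print(f"latitude {lat:4.1f}: {ground_resolution(lat, 18):.3f} m/pixel at level 18")
```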

Table 1. Data type, provided information, and the source used for accessing data involved in the present study.

| Data | Information | Source |
| --- | --- | --- |
| GES imagery | high spatial resolution satellite imagery data | https://www.google.com/earth |
| FROM-GLC30 | 30-m spatial resolution global land cover data | http://data.ess.tsinghua.edu.cn |

FROM-GLC30

The spatial stratified sampling standard was based on a priori knowledge of the urban land cover from the global 30 m resolution land cover data (FROM-GLC30) created in 2017 (Table 1). These data include the following types: cultivated land, woodland, grassland, shrubland, wetland, waterbody, tundra, artificial surface, bare land, glacier, and permafrost. The overall accuracy of the FROM-GLC30 data is 72.43%31, which is based on a global all-season validation sample set from more than 36,000 locations.

Site selection

In the present study, we selected 90 cities in China (Table 2) and partitioned them into four tiers based on the city administration hierarchy established by the government in China. Tier 1 comprised municipalities under central administration and special administrative regions. Tier 2 comprised mainly sub-provincial cities, while Tier 3 comprised provincial capitals and major prefecture-level cities. Tier 4 contained ordinary prefecture-level cities. The locations of the 90 cities are displayed in Fig. 2. These cities include all provincial capitals and cover the major administrative levels in China, and their distribution spans different climate regions; thus, they are economically, politically, and geographically representative. In terms of population, the 90 selected cities account for about 40% of China's total population.

Table 2. Data for the 90 cities in China involved in the present study (each entry gives the city name followed by its code).

Tier 1 (Count: 6): Macao 101; Beijing 102; Chongqing 103; Shanghai 104; Tianjin 105; Hong Kong 106

Tier 2 (Count: 14): Chengdu 201; Guangzhou 202; Harbin 203; Hangzhou 204; Jinan 205; Nanjing 206; Ningbo 207; Qingdao 208; Xiamen 209; Shenzhen 210; Shenyang 211; Wuhan 212; Xi’an 213; Changchun 214

Tier 3 (Count: 24): Anshan 301; Baotou 302; Datong 303; Fuzhou 304; Guiyang 305; Haikou 306; Hefei 307; Hohhot 308; Jilin 309; Kunming 310; Lhasa 311; Lanzhou 312; Nanchang 313; Nanning 314; Qiqihar 315; Shijiazhuang 316; Suzhou 317; Taiyuan 318; Urumqi 319; Xining 320; Yinchuan 321; Changsha 322; Zhengzhou 323; Zibo 324

Tier 4 (Count: 46): Ankang 401; Bazhong 402; Baiyin 403; Baise 404; Changde 405; Chaozhou 406; Chifeng 407; Dali 408; Datong 409; Dongguan 410; Ganzhou 411; Guigang 412; Haidong 413; Heyuan 414; Hebi 415; Hengshui 416; Jixi 417; Jining 418; Jiangmen 419; Jingmen 420; Jiujiang 421; Karamay 422; Lijiang 423; Liupanshui 424; Nanchong 425; Nanping 426; Pingxiang 427; Qinzhou 428; Rizhao 429; Sanya 430; Shannan 431; Songyuan 432; Tongliao 433; Tongling 434; Weifang 435; Wenzhou 436; Yan’an 437; Yancheng 438; Yichang 439; Yulin (Guangxi Province) 440; Yuxi 441; Yuncheng 442; Zhangye 443; Zhaotong 444; Zhongwei 445; Zigong 446

Fig. 2. The 90 selected cities in China organized in a hierarchy of four tiers.

Information about the characteristics of the 90 cities in each tier is presented in Table 3 (collated from national statistics32,33; no data are available for Hong Kong and Macao). The administrative center bias in resource allocation gives cities with a higher administrative rank an advantage in terms of expansion. Therefore, the average population and built district area show an increasing trend from Tier 4 to Tier 1. Cities in Tier 1 and Tier 2 are mostly the economic centers of provinces, with developed economies and high urbanization rates, and usually show a multi-core urban morphology. Cities in Tier 3 and Tier 4, on the other hand, usually have a single-core urban morphology. In addition, the shape of cities in each tier generally varies depending on the topography. Cities in the plains tend to have a clumped shape, while cities along rivers and valleys tend to have a striped shape.

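Table 3. Characteristics of the 90 cities in different tiers.

| Characteristic | Statistic | Tier 1 | Tier 2 | Tier 3 | Tier 4 |
| --- | --- | --- | --- | --- | --- |
| Area of administrative district (km²) | MIN | 6,340.50 | 1,516.00 | 2,315.00 | 1,918.00 |
| Area of administrative district (km²) | AVE | 29,271.87 | 13,452.85 | 14,799.38 | 19,749.02 |
| Area of administrative district (km²) | MAX | 82,370.00 | 53,186.00 | 44,287.00 | 90,064.00 |
| Area of built district (km²) | MIN | 1,151.05 | 354.79 | 87.27 | 10.80 |
| Area of built district (km²) | AVE | 1,343.34 | 714.08 | 296.36 | 131.11 |
| Area of built district (km²) | MAX | 1,515.41 | 1,324.17 | 580.75 | 1,194.31 |
| Permanent population (ten thousand) | MIN | 1,386.60 | 516.40 | 86.79 | 35.40 |
| Permanent population (ten thousand) | AVE | 2,317.11 | 1,183.23 | 587.13 | 362.64 |
| Permanent population (ten thousand) | MAX | 3,205.42 | 2,093.78 | 1,274.83 | 1,046.66 |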

Data preparation

GES imagery preprocessing

The quality of a GES image varies with the imaging sensor, imaging time, and environmental factors (e.g., atmospheric condition and climate), all of which affect model training and generalization. Therefore, a standardization procedure for GES imagery is necessary. In the present study, the Gamma Correction algorithm34 and the Contrast Limited Adaptive Histogram Equalization algorithm35 were used to resolve brightness and sharpness issues.
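As an illustration of the brightness part of this standardization, the following is a minimal pure-Python sketch of gamma correction through a lookup table on an 8-bit grayscale image. The gamma value, the function names, and the nested-list image representation are assumptions made for the example; the text does not report the parameters actually used, and the Contrast Limited Adaptive Histogram Equalization step is not reproduced here.

```python
def gamma_lut(gamma: float) -> list[int]:
    """Build a 256-entry lookup table mapping v to 255 * (v / 255) ** gamma."""
    return [round(255 * (v / 255) ** gamma) for v in range(256)]


def gamma_correct(image: list[list[int]], gamma: float) -> list[list[int]]:
    """Apply gamma correction to an 8-bit grayscale image stored as nested lists."""
    lut = gamma_lut(gamma)
    return [[lut[v] for v in row] for row in image]


if __name__ == "__main__":
    dark_patch = [[0, 16, 64], [32, 128, 255]]
    # gamma < 1 brightens mid-tones; gamma > 1 darkens them
    print(gamma_correct(dark_patch, 0.5))
```

In practice the same lookup table would be applied per channel to an image array, but the mapping itself is unchanged.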

Spatial stratified sampling strategy

The study area involves several land cover types, and the proportion of unbuilt areas, including water, cultivated land, and forest, exceeds that of built-up areas. Consequently, regular random sampling creates a class imbalance problem, in which the proportion of negative samples (non-rooftop samples) significantly surpasses that of positive samples (rooftop samples). To address this, a priori knowledge of the urban land cover was utilized to partition the study area into built-up and unbuilt areas based on the FROM-GLC30 data. The built-up area contains mainly artificial surfaces, which easily yield positive samples. In contrast, the unbuilt area comprises water bodies, wetlands, grasslands, bare lands, cultivated lands, shrublands, and forests, which commonly provide negative samples.

However, based on empirical evidence from previous studies in unbuilt areas, we found that farmlands, bare lands, and intersections of different land cover types can be misclassified by the rooftop extraction model. Therefore, in the present study, the unbuilt areas were manually filtered to determine confusing areas, and these areas supplied confusing negative samples.

To obtain representative and balanced positive and negative samples, spatial stratified sampling was employed in the sample acquisition. The results of stratified sampling in the built-up and confusing areas are displayed in Fig. 3, and this approach was used to produce patches from the GES images.
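The idea of drawing samples per stratum rather than uniformly over the whole study area can be sketched as follows. This is a simplified illustration, not the authors' ArcGIS Pro procedure: the stratum labels (`built_up`, `confusing`, `other_unbuilt`), the grid-cell representation, and the quotas are all hypothetical.

```python
import random


def stratified_sample(cells, quotas, seed=0):
    """Draw a fixed quota of grid cells from each requested stratum.

    cells  : list of (cell_id, stratum_label) pairs
    quotas : dict mapping stratum_label to the number of cells to draw
    """
    rng = random.Random(seed)
    sample = {}
    for stratum, quota in quotas.items():
        candidates = [cell_id for cell_id, label in cells if label == stratum]
        # Never ask for more cells than the stratum contains
        sample[stratum] = rng.sample(candidates, min(quota, len(candidates)))
    return sample


if __name__ == "__main__":
    # Hypothetical study area dominated by unbuilt land
    cells = ([(i, "built_up") for i in range(50)]
             + [(i, "confusing") for i in range(50, 80)]
             + [(i, "other_unbuilt") for i in range(80, 1000)])
    picked = stratified_sample(cells, {"built_up": 20, "confusing": 16})
    print({name: len(ids) for name, ids in picked.items()})
```

With uniform random sampling over these 1,000 cells, about 92% of the draws would fall in the `other_unbuilt` stratum, which is the imbalance the stratification is meant to avoid.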

Fig. 3. Results of stratified sampling in the study area.

Sample processing and division

In the present study, sample images were manually labeled as positive or negative. Patches of the image samples and the corresponding masks were then divided into training samples (which served as input for training the rooftop extraction model) and test samples (which served as ground truth for validating the rooftop area dataset).

For the rooftop extraction model training task, the built-up area covered by the image samples and their corresponding masks was 200 km², while the confusing area was 160 km², so the training samples covered 360 km² in total. According to previous deep learning studies, larger training data volumes produce more robust models. However, in practice, available data are usually limited. To resolve this limitation, data augmentation that does not change the labeled categories was conducted, thereby enhancing the generalization potential of the model. Data augmentation operations used in the present study included the following: random cropping, image rotation, image flipping, image blurring, and noise addition.
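For segmentation, geometric augmentations have to be applied identically to an image and its mask so that every pixel keeps its label. The sketch below illustrates this with flips and a 90° rotation on small nested-list grids; the function names and the 0.5 probability are assumptions for the example, and cropping, blurring, and noise addition are omitted.

```python
import random


def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def vflip(grid):
    """Mirror a 2D grid top to bottom."""
    return [list(row) for row in grid[::-1]]


def rot90(grid):
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(col) for col in zip(*grid[::-1])]


def augment_pair(image, mask, rng):
    """Apply the same randomly chosen geometric operations to image and mask."""
    for op in (hflip, vflip, rot90):
        if rng.random() < 0.5:
            image, mask = op(image), op(mask)
    return image, mask


if __name__ == "__main__":
    image = [[1, 2], [3, 4]]
    mask = [[0, 0], [1, 1]]   # toy label: 1 where the pixel value exceeds 2
    print(augment_pair(image, mask, random.Random(42)))
```

Photometric operations such as blurring or noise addition would be applied to the image only, since they do not move pixels and therefore leave the mask valid.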

Modelling

Naive rooftop extraction model

Rooftop areas of cities were extracted using the DeepLabV3+ model and GES images. DeepLabV3+ is an open-source image semantic segmentation model that was launched by the Google R&D team36. In GES images, the rooftop areas exhibit varied sizes and shapes because of differences in architectural styles and dimensions37. Rooftop edges are sometimes difficult to identify accurately because the GES image quality is affected by weather conditions38. DeepLabV3+ can perceive features at different scales, thereby improving the recognition accuracy for diverse rooftops. In addition, DeepLabV3+ enables transformation of the feature map into a constant resolution map based on the encoder–decoder structure, which alleviates the problem of blurred edges in rooftop area extraction39.

In the present study, the cross-entropy40 and dice loss functions41 were integrated to generate a composite loss function that can simultaneously handle the class imbalance problem and accelerate the convergence of training. The composite loss is expressed as follows:

$$L(p_i, \hat{p}_i) = \alpha\, L_{dice}(p_i, \hat{p}_i) + \beta\, L_{bce}(p_i, \hat{p}_i) \tag{1}$$

where $p_i$ denotes the predicted value of the ith sample, $\hat{p}_i$ represents the ground truth value of the ith sample, $L_{dice}(\cdot)$ is the dice loss function, $L_{bce}(\cdot)$ stands for the cross-entropy loss function, and $\alpha$ and $\beta$ are weight coefficients of the loss function, with corresponding values of 0.2 and 0.8 in the present study.
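A minimal numerical sketch of Eq. 1 is given below, using the reported weights of 0.2 and 0.8. The exact forms of the two component losses are assumptions: the dice term uses a common smoothed formulation and the cross-entropy term is the mean binary cross-entropy with clipped probabilities, neither of which is specified in the text. Predictions and labels are flattened to plain lists for simplicity.

```python
import math

ALPHA, BETA = 0.2, 0.8  # weights reported in the text


def bce_loss(pred, truth, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for p, g in zip(pred, truth):
        p = min(max(p, eps), 1 - eps)
        total -= g * math.log(p) + (1 - g) * math.log(1 - p)
    return total / len(pred)


def dice_loss(pred, truth, smooth=1.0):
    """One minus the smoothed Dice coefficient (smoothing term is an assumption)."""
    intersection = sum(p * g for p, g in zip(pred, truth))
    return 1 - (2 * intersection + smooth) / (sum(pred) + sum(truth) + smooth)


def composite_loss(pred, truth, alpha=ALPHA, beta=BETA):
    """Weighted sum of the dice and cross-entropy terms, as in Eq. 1."""
    return alpha * dice_loss(pred, truth) + beta * bce_loss(pred, truth)


if __name__ == "__main__":
    truth = [1, 0, 1, 0]
    print(composite_loss([0.9, 0.1, 0.8, 0.2], truth))  # mostly correct prediction
    print(composite_loss([0.1, 0.9, 0.2, 0.8], truth))  # mostly wrong prediction
```

The dice term depends on the overlap between predicted and true rooftop pixels rather than on the per-pixel average, which is why it is often paired with cross-entropy when positives are rare.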

Expansion prediction

To extract rooftop areas, standardized GES images served as input for the ensemble model. Because the original GES images were significantly larger than the input dimension required by the model, they had to be cropped into smaller patches for prediction. However, this creates an uneven transition (the stitched problem) at the splicing gaps between the prediction results of the cropped patches42. The expansion prediction technique is suitable for eliminating this uneven transition at the splicing gap.

Steps implemented in the expansion prediction (Fig. 4) include the following: (1) An n × n sliding window with a step size of n was created. As it moved, this window was expanded to m × m, and the original remote sensing image was cropped into small patches accordingly. (2) The cropped images were concatenated into a tensor. (3) The ensemble rooftop extraction model was then used to predict the tensor obtained in step 2. (4) The central n × n portion of each predicted tensor element was extracted as a patch. (5) Patches obtained in step 4 were stitched to produce a large predicted image, which was then cropped to the extent of the original GES image.

Fig. 4. Illustration of steps involved in the expansion prediction method.

Model training

Overall network parameters were fine-tuned using the MS COCO dataset43 and a pre-trained Xception backbone network44, while the AdamW optimizer45 and the Cosine Annealing Warm Restarts algorithm46 were employed for rapid convergence of the network. Four Tesla V100 GPUs were used to train the model using the mixed-precision method, and the development was based on the PyTorch framework47. The hyperparameter settings are presented in Table 4.

Table 4. Summary of hyperparameters utilized in the present study.

| Hyperparameter | Value |
| --- | --- |
| Learning rate | 0.02 |
| Weight decay | 0.0005 |
| T_0 | 2 |
| T_mult | 2 |
| Output stride | 16 |
| Size of input image | 384 |

The parameter T_0 refers to the number of iterations before the first restart, while T_mult denotes the factor by which the cycle length increases after each restart in Cosine Annealing Warm Restarts.
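To show what these two parameters imply, the sketch below evaluates the usual cosine-annealing-with-warm-restarts formula, η = η_min + ½(η_max − η_min)(1 + cos(π·T_cur/T_i)), with the values from Table 4. Two assumptions are made that the text does not state: the schedule is stepped once per epoch, and the minimum learning rate is zero. The function names are illustrative.

```python
import math


def cosine_warm_restart_lr(epoch, base_lr=0.02, t0=2, t_mult=2, eta_min=0.0):
    """Learning rate at an integer epoch under cosine annealing with warm restarts."""
    t_i, t_cur = t0, epoch
    while t_cur >= t_i:  # skip over completed cycles
        t_cur -= t_i
        t_i *= t_mult
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))


def restart_epochs(max_epoch, t0=2, t_mult=2):
    """Epochs at which the learning rate jumps back to base_lr."""
    epochs, start, length = [], 0, t0
    while start + length <= max_epoch:
        start += length
        epochs.append(start)
        length *= t_mult
    return epochs


if __name__ == "__main__":
    print(restart_epochs(62))
    print([round(cosine_warm_restart_lr(e), 4) for e in range(8)])
```

Under these assumptions the cycle lengths double from 2 to 4, 8, 16, and 32 epochs, and the model state just before each restart corresponds to a local optimum that can serve as a snapshot.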

After the model reached convergence, rooftop areas were predicted using the snapshot approach48 based on the union ensemble rule and the expansion prediction technique. The union ensemble rule is defined in Eq. 2. Considering the trade-off between the quality and speed of prediction, two local optimal models (in this work, checkpoints of the 29th and 55th epochs were chosen) were integrated into the ensemble rooftop extraction model. In addition, the TensorRT framework was used to accelerate the model calculation based on FP16 precision. Accordingly, the prediction speed was increased approximately 5-fold without a significant loss in accuracy, thereby reducing the prediction time from 100 to 20 days.

$$P_{union\,ensemble} = \bigcup_{i=1}^{N} P_i \tag{2}$$

where $P_i$ denotes the matrix of the prediction category of the ith sub-model, $P_{union\,ensemble}$ represents the result of the union integration, and $N$ is the number of sub-models.
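For binary rooftop masks, the union in Eq. 2 amounts to a pixel-wise logical OR across the sub-model predictions. A minimal sketch, assuming each prediction is a nested list of 0/1 values of identical shape (the function name is illustrative):

```python
def union_ensemble(masks):
    """Pixel-wise union (logical OR) of binary masks of identical shape."""
    height, width = len(masks[0]), len(masks[0][0])
    return [[int(any(mask[r][c] for mask in masks)) for c in range(width)]
            for r in range(height)]


if __name__ == "__main__":
    snapshot_a = [[1, 0, 0], [0, 0, 0]]
    snapshot_b = [[1, 0, 0], [0, 1, 0]]
    print(union_ensemble([snapshot_a, snapshot_b]))
```

Because a union can only add rooftop pixels relative to each individual sub-model, this rule tends to favor completeness of the extraction over exactness.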

Data Records

Data file path

The rooftop area dataset will be updated regularly in line with the urbanization process of different cities in China. The dataset, metadata, and detailed documentation are freely available to all users at the National Tibetan Plateau Data Center (10.11888/Geogra.tpdc.271702)49.

Data file format

The rooftop area dataset is organized by city tier, and the rooftop area data of each city are stored in ESRI Shapefile format50, which is composed of .shx, .shp, .prj, .dbf, and .cpg files. The dataset is divided into an original version and a simplified version. The original version is converted directly from the prediction results of the model, and the simplified version is derived from it using the Douglas–Peucker algorithm51. The total sizes of the two versions are 118 GB and 21.3 GB, respectively, without compression.
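The effect of the simplification can be illustrated with a compact recursive version of the Douglas–Peucker algorithm. This is a generic textbook sketch, not the tool the authors ran; the tolerance and the sample outline are made up for the example, and distances are measured to the chord as a segment, which also handles closed rings whose first and last points coincide.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # degenerate segment, e.g. a closed ring
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop vertices that deviate from the simplified line by at most tolerance."""
    if len(points) < 3:
        return list(points)
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # the split vertex appears in both halves


if __name__ == "__main__":
    # Jagged raster-like edge followed by a sharp corner (coordinates in metres)
    outline = [(0, 0), (1, 0.05), (2, -0.04), (3, 0), (3, 3)]
    print(douglas_peucker(outline, tolerance=0.1))
```

Small wobbles along the edge are removed while the corner is kept, which is consistent with the much smaller size reported for the simplified version of the dataset.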

File structure

The rooftop area data of each city contains three fields (area, X and Y), as presented in Table 5. All area fields are double float format, in square meters, calculated in CGCS 2000 Albers geographic coordinates. All X and Y fields are double float format, in decimal degrees, calculated in WGS 1984 Web Mercator Auxiliary Sphere geographic coordinates.

Table 5. Field description for rooftop area dataset.

| Field | Format | Definition | Unit | Geographic reference |
| --- | --- | --- | --- | --- |
| Area | double float | Area of each rooftop feature | Square meter | CGCS 2000 Albers |
| X | double float | Longitude of the central point of each rooftop feature | Decimal degree | WGS 1984 Web Mercator Auxiliary Sphere |
| Y | double float | Latitude of the central point of each rooftop feature | Decimal degree | WGS 1984 Web Mercator Auxiliary Sphere |
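As a small illustration of how an area value of this kind can be derived, the sketch below applies the shoelace formula to a polygon ring whose vertices are already expressed in a projected coordinate system with metre units. It is a generic planar-area calculation under that assumption, not the GIS routine used to populate the Area field, and it ignores interior holes.

```python
def polygon_area(ring):
    """Planar area of a simple polygon ring given as (x, y) pairs in metres."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


if __name__ == "__main__":
    rooftop = [(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)]  # hypothetical footprint
    print(polygon_area(rooftop), "square metres")
```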

Technical Validation

Sampling design

Based on the spatial stratified sampling method proposed in this study, the test samples used for the validation of the rooftop area dataset were obtained from GES images and manually labeled with the ground truth through visual interpretation. To better reflect the quality of the rooftop extraction results in each city tier, test samples covering 45 km² were created for each of the four tiers. This produced a test dataset covering 180 km², in which the built-up and the confusing areas covered by the image samples and their corresponding masks were 100 and 80 km², respectively.

Analysis design

Qualitative and quantitative evaluation criteria were utilized to validate the generated rooftop area dataset. For the qualitative evaluation, morphological and topological characteristics of the ground truth and the extracted rooftop area data were compared for various city tiers. For the quantitative evaluation, a testing dataset covering 180 km², comprising 45 km² for each of the four city tiers, was employed. Indicators calculated from the confusion matrix52, namely accuracy, precision, recall, and F1 score, were used. The closer the values of these indicators are to 100%, the higher the quality of the rooftop extraction. Precision is equivalent to user accuracy, which is a measure of exactness, and recall is equivalent to producer accuracy, which is a measure of completeness. Precision and recall can also be communicated in terms of error, as either commission error (1 − precision) or omission error (1 − recall). In particular, the F1 score, which is the harmonic mean of recall and precision, is an important indicator for comprehensive evaluation of rooftop extraction results. These data and indicators were then used to evaluate the relationship between the rooftop extraction results and the ground truth53.
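The indicators above follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the function names and the pixel counts in the usage example are made up for illustration, and the code assumes that at least one rooftop pixel is both present and predicted so that no denominator is zero.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def segmentation_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, F1, and the two error rates from pixel counts."""
    precision = tp / (tp + fp)  # user accuracy
    recall = tp / (tp + fn)     # producer accuracy
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
        "commission_error": 1 - precision,
        "omission_error": 1 - recall,
    }


if __name__ == "__main__":
    # Hypothetical pixel counts for one test patch
    for name, value in segmentation_metrics(tp=8, fp=2, fn=2, tn=88).items():
        print(f"{name}: {100 * value:.2f}%")
```

As a consistency check, a precision of 87.54% and a recall of 83.70% give an F1 score of 85.58% under this definition. Because non-rooftop pixels dominate, accuracy is usually much higher than the F1 score, which is why both are reported.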

Validation results

The validation results for various city tiers are presented in Table 6. The overall accuracy for all city tiers is 97.95%, while the F1 score is 83.11%. However, the overall quality of the rooftop area dataset for high-tier cities is better than that of the lower-tier cities. Based on empirical exploration, we realized that the GES image quality for various cities varied according to the imaging sensor, imaging time, and environmental factors such as the atmospheric condition and climate. These image quality differences affect the model training and generalization. The quality of GES images for high-tier cities is better, so the results of the model extraction are generally better, which explains the quality difference between the rooftop area dataset for different city tiers. Therefore, in using this dataset for other applications, an evaluation of the impact of regional differences in quality on specific applications is necessary.

Table 6. Summarized data from the evaluation of rooftop extraction results associated with different city tiers.

| City tier | Accuracy (%) | F1 score (%) | Producer accuracy/Recall (%) | User accuracy/Precision (%) | Omission error (%) | Commission error (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Tier 1 | 98.17 | 85.58 | 83.70 | 87.54 | 16.30 | 12.46 |
| Tier 2 | 97.60 | 83.57 | 79.65 | 87.89 | 20.35 | 12.11 |
| Tier 3 | 98.16 | 83.45 | 78.43 | 89.17 | 21.57 | 10.83 |
| Tier 4 | 97.95 | 82.13 | 78.21 | 86.46 | 21.79 | 13.54 |
| Overall | 97.95 | 83.11 | 78.96 | 87.77 | 21.04 | 12.23 |

In addition, we validated the quality of the rooftop area dataset for different city tiers using two types of sampling areas, as shown in Fig. 5. Each submap corresponds to a ground extent of 1 km2 and the indicators for evaluating the rooftop extraction results are given below. In addition, we visualized the elements TN (True Negative), TP (True Positive), FN (False Negative) and FP (False Positive) in the confusion matrix. In this way the commission and omission errors in the extraction results can be clearly indicated by FP (in blue) and FN (in orange), respectively. It can be seen that the rooftop extraction in the built-up area is better than that in the confusing area, which explains why we use a stratified sampling strategy when collecting samples. In general, the extraction results adequately delineate characteristics of the rooftops, and these are consistent with the GES images. Our dataset also comprises small, sparsely distributed, and irregularly structured rooftops in confusing area, which highlights that the rooftop area dataset generated by the proposed framework has good performance in finding details and effectively avoids the interference of complex background information.

Fig. 5. Images highlighting the evaluation of the rooftop area dataset for different city tiers and sampling areas, using different colors to visualize omission and commission errors.

Examples of the rooftop area dataset for different city tiers are shown in Fig. 6. The extracted rooftop areas exhibit clustering, and this is consistent with the spatial morphology and city boundaries. Therefore, the spatial distribution of rooftop areas helps in understanding the urbanization level and urban planning needs of cities. For example, Shanghai is a high-density mega-city in China, and its flat and vast plains and numerous waterways and ports provide advantages for the establishment of its polycentric urban spatial system. The concept of intensive development has also resulted in a more compact urban space in Shanghai. In contrast, the eastern and northern parts of Harbin are mountainous and hilly, and the expansion of the city is limited by natural conditions, so the urban space is loose and the urban area is mainly distributed in the central and western plains.

Fig. 6. Images highlighting the integrity of the rooftop area dataset for different city tiers, displaying results in both urban and rural space within city boundaries.

Specific details on the rooftop area dataset for different city tiers are displayed in Fig. 7 using Beijing, Nanjing, Jilin, and Hengshui as examples. Compared with the GES images, the rooftop extraction results display significant details for various cities, with rooftops and complex backgrounds effectively distinguished. Moreover, dense and sparse spatial distribution areas are accurately extracted. In fact, the extraction results exhibit no sign of the stitched problem because of the application of the expansion prediction.

Fig. 7. Images for extracted rooftop areas in different cities, indicating the position offset on buildings of different heights.

Meanwhile, Fig. 7 reveals that the GES image used in the present study was not adequately orthorectified, and this partially accounts for the offset in positions between the extracted rooftop area and the ground truth. This offset in positions is significant (approximately 20 m based on empirical survey) for a few high-rise buildings, but it is negligible for low- and medium-rise buildings. Therefore, if the rooftop area data are used for a major city- or country-scale estimation, such position offsets will minimally impact the results. Nevertheless, the rooftop extraction results obtained in the present study provide high-quality details suitable for supporting many architecture-oriented applications.

The empirical evidence of our study in 90 Chinese cities shows that the proposed method can be quickly generalized at a large scale and shows strong robustness in regions with different characteristics. When it is to be extended to regions outside of China, we suggest fine-tuning the existing model by adding new feature samples, to enhance its applicability in the target region.

Acknowledgements

We appreciate the detailed suggestions and comments from the anonymous reviewers. We express heartfelt thanks to the other members of the Smart City Sensing and Simulation lab, who undertook data collection and annotation work. This work was supported by the National Natural Science Foundation (Grant 41930648), Joint Fund Project of National Natural Science Foundation of China (Grant U1811464), and National Research Foundation Singapore. The data of this work is licensed and hosted by National Tibetan Plateau Data Center.

Author contributions

Z.X.Z. investigated the study area and related work, designed the geospatial stratified sampling algorithm, collected and processed GES images, and wrote the manuscript. Z.Q. designed, trained, and applied the rooftop extraction model, developed parallel computing programs, simplified the rooftop area dataset, and wrote the manuscript. T.Z. assisted with validation and quality control and reviewed the manuscript. M.C. designed and condensed experimental ideas, guided the experimental work, reviewed the manuscript, and provided project funding support. K.Z. collected and processed GES images and published the rooftop area dataset. Y.Y. converted the rooftop area dataset from raster format to vector format and ran the data preprocessing procedure using a supercomputer platform. R.Z., F.Z., and H.R.Z. provided technical guidance and reviewed the manuscript. F.Z.Z., J.N.Y., and B.Y.Z. processed and quality-controlled the GES images. G.N.L. and J.Y.Y. designed and condensed experimental ideas and reviewed the manuscript.

Code availability

The spatial sampling procedure is executed in the ArcGIS Pro platform. The code of the deep learning model is available at https://github.com/ChanceQZ/RoofTopSegmatation. The program is written in Python 3 and relies mainly on the PyTorch, NumPy, and OpenCV packages.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Biljecki F, Heuvelink GB, Ledoux H, Stoter J. Propagation of positional error in 3D GIS: estimation of the solar irradiation of building roofs. International Journal of Geographical Information Science. 2015;29:2269–2294. doi: 10.1080/13658816.2015.1073292.
  • 2. Nouvel R, Zirak M, Coors V, Eicker U. The influence of data quality on urban heating demand modeling using 3D city models. Computers, Environment and Urban Systems. 2017;64:68–80. doi: 10.1016/j.compenvurbsys.2016.12.005.
  • 3. Chen Y, Tang L, Yang X, Bilal M, Li Q. Object-based multi-modal convolution neural networks for building extraction using panchromatic and multispectral imagery. Neurocomputing. 2020;386:136–146. doi: 10.1016/j.neucom.2019.12.098.
  • 4. Biljecki F, Arroyo Ohori K, Ledoux H, Peters R, Stoter J. Population estimation using a 3D city model: A multi-scale country-wide study in the Netherlands. PloS one. 2016;11:e0156808. doi: 10.1371/journal.pone.0156808.
  • 5. Acuto M, Parnell S, Seto KC. Building a global urban science. Nature Sustainability. 2018;1:2–4. doi: 10.1038/s41893-017-0013-9.
  • 6. Hu Q, Zhen L, Mao Y, Zhou X, Zhou G. Automated building extraction using satellite remote sensing imagery. Automation in Construction. 2021;123:103509. doi: 10.1016/j.autcon.2020.103509.
  • 7. Assouline D, Mohajeri N, Scartezzini JL. Quantifying rooftop photovoltaic solar energy potential: A machine learning approach. Solar Energy. 2017;141:278–296. doi: 10.1016/j.solener.2016.11.045.
  • 8. Assouline D, Mohajeri N, Scartezzini JL. Large-scale rooftop solar photovoltaic technical potential estimation using Random Forests. Applied energy. 2018;217:189–211. doi: 10.1016/j.apenergy.2018.02.118.
  • 9. Dehwah AH, Asif M. Assessment of net energy contribution to buildings by rooftop photovoltaic systems in hot-humid climates. Renewable energy. 2019;131:1288–1299. doi: 10.1016/j.renene.2018.08.031.
  • 10. Chen Y, Hong T, Piette MA. Automatic generation and simulation of urban building energy models based on city datasets for city-scale building retrofit analysis. Applied Energy. 2017;205:323–335. doi: 10.1016/j.apenergy.2017.07.128.
  • 11. Mohajeri N, et al. A city-scale roof shape classification using machine learning for solar energy applications. Renewable Energy. 2018;121:81–93. doi: 10.1016/j.renene.2017.12.096.
  • 12. Shepero M, Munkhammar J, Widén J, Bishop JD, Boström T. Modeling of photovoltaic power generation and electric vehicles charging on city-scale: A review. Renewable and Sustainable Energy Reviews. 2018;89:61–71. doi: 10.1016/j.rser.2018.02.034.
  • 13. Chen D, Wang R, Peethambaran J. Topologically aware building rooftop reconstruction from airborne laser scanning point clouds. IEEE Transactions on Geoscience and Remote Sensing. 2017;55:7032–7052. doi: 10.1109/TGRS.2017.2738439.
  • 14. Huang J, Zhang X, Xin Q, Sun Y, Zhang P. Automatic building extraction from high-resolution aerial images and LiDAR data using gated residual refinement network. ISPRS journal of photogrammetry and remote sensing. 2019;151:91–105. doi: 10.1016/j.isprsjprs.2019.02.019.
  • 15. Gooding J, Crook R, Tomlin AS. Modelling of roof geometries from low-resolution LiDAR data for city-scale solar energy applications using a neighbouring buildings method. Applied Energy. 2015;148:93–104. doi: 10.1016/j.apenergy.2015.03.013.
  • 16. Poli D, Remondino F, Angiuli E, Agugiaro G. Radiometric and geometric evaluation of GeoEye-1, WorldView-2 and Pléiades-1A stereo images for 3D information extraction. ISPRS Journal of Photogrammetry and Remote Sensing. 2015;100:35–47. doi: 10.1016/j.isprsjprs.2014.04.007.
  • 17. Kabir MH, Endlicher W, Jägermeyr J. Calculation of bright roof-tops for solar PV applications in Dhaka Megacity, Bangladesh. Renewable Energy. 2010;35:1760–1764. doi: 10.1016/j.renene.2009.11.016.
  • 18. Hazelhoff, L. In Visual Information Processing and Communication II, 7882, 788208 (International Society for Optics and Photonics, 2011).
  • 19. Taylor JR, Lovell ST. Mapping public and private spaces of urban agriculture in Chicago through the analysis of high-resolution aerial images in Google Earth. Landscape and urban planning. 2012;108:57–70. doi: 10.1016/j.landurbplan.2012.08.001.
  • 20. Yu L, Gong P. Google Earth as a virtual globe tool for Earth science applications at the global scale: progress and perspectives. International Journal of Remote Sensing. 2012;33:3966–3986. doi: 10.1080/01431161.2011.636081.
  • 21. Liang J, Gong J, Li W. Applications and impacts of Google Earth: A decadal review (2006–2016). ISPRS Journal of Photogrammetry and Remote Sensing. 2018;146:91–107. doi: 10.1016/j.isprsjprs.2018.08.019.
  • 22. Ye Y, et al. Measuring daily accessed street greenery: A human-scale approach for informing better urban planning practices. Landscape and Urban Planning. 2019;191:103434. doi: 10.1016/j.landurbplan.2018.08.028.
  • 23. Chen, L. C., Zhu, Y., Papandreou, G., Schroff, F. & Adam, H. In Proceedings of the European conference on computer vision (ECCV). 801–818 (2018).
  • 24. Chen G, et al. Symmetrical dense-shortcut deep fully convolutional networks for semantic segmentation of very-high-resolution remote sensing images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2018;11:1633–1644. doi: 10.1109/JSTARS.2018.2810320.
  • 25. Long, J., Shelhamer, E. & Darrell, T. In Proceedings of the IEEE conference on computer vision and pattern recognition. 3431–3440 (2015).
  • 26. Noh, H., Hong, S. & Han, B. In Proceedings of the IEEE international conference on computer vision. 1520–1528 (2015).
  • 27. Li P, et al. Understanding rooftop PV panel semantic segmentation of satellite and aerial images for better using machine learning. Advances in Applied Energy. 2021;4:100057. doi: 10.1016/j.adapen.2021.100057.
  • 28. Zhong T, et al. A city-scale estimation of rooftop solar photovoltaic potential based on deep learning. Applied Energy. 2021;298:117132. doi: 10.1016/j.apenergy.2021.117132.
  • 29. Shorten C, Khoshgoftaar TM. A survey on image data augmentation for deep learning. Journal of Big Data. 2019;6:1–48. doi: 10.1186/s40537-019-0197-0.
  • 30. Wang Y, Yao Q, Kwok JT, Ni LM. Generalizing from a few examples: A survey on few-shot learning. ACM Computing Surveys (CSUR). 2020;53:1–34.
  • 31. Chen B, et al. Stable classification with limited sample: Transferring a 30-m resolution sample set collected in 2015 to mapping 10-m resolution global land cover in 2017. Sci. Bull. 2019;64:370–373. doi: 10.1016/j.scib.2019.03.002.
  • 32. Ministry of Housing and Urban-Rural Development of the People’s Republic of China. China urban-rural construction statistical yearbook. http://www.mohurd.gov.cn/ (2019).
  • 33. National Bureau of Statistics of the People’s Republic of China. Bulletin of the seventh national census. http://www.stats.gov.cn/tjsj/tjgb/rkpcgb/ (2021).
  • 34. Peng, G. & Lin, Z. A study on gamma correction and its implementation in image processing. Electronic engineer 2 (2006).
  • 35. Pizer SM, et al. Adaptive histogram equalization and its variations. Computer vision, graphics, and image processing. 1987;39:355–368. doi: 10.1016/S0734-189X(87)80186-X.
  • 36. Chen, L. C., Papandreou, G., Schroff, F. & Adam, H. Rethinking atrous convolution for semantic image segmentation. arXiv 2017. arXiv preprint arXiv:1706.05587 (2019).
  • 37. Xiong Y, Liu J, Kim J. Understanding differences in thermal comfort between urban and rural residents in hot summer and cold winter climate. Building and environment. 2019;165:106393. doi: 10.1016/j.buildenv.2019.106393.
  • 38. Burdziakowski P. A Novel Method for the Deblurring of Photogrammetric Images Using Conditional Generative Adversarial Networks. Remote Sensing. 2020;12:2586. doi: 10.3390/rs12162586.
  • 39. Du S, Du S, Liu B, Zhang X. Incorporating DeepLabv3+ and object-based image analysis for semantic segmentation of very high resolution remote sensing images. International Journal of Digital Earth. 2021;14:357–378. doi: 10.1080/17538947.2020.1831087.
  • 40. Boudiaf, M. et al. In European Conference on Computer Vision. 548–564 (Springer, 2020).
  • 41. Milletari, F., Navab, N. & Ahmadi, S.-A. In 2016 fourth international conference on 3D vision (3DV). 565–571 (IEEE, 2016).
  • 42. Huang, B. et al. In IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium. 6947–6950 (IEEE, 2018).
  • 43. Lin, T. Y. et al. In European conference on computer vision. 740–755 (Springer, 2014).
  • 44. Chollet, F. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1251–1258 (2017).
  • 45. Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017).
  • 46. Loshchilov, I. & Hutter, F. Sgdr: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983 (2016).
  • 47. Paszke A, et al. Pytorch: An imperative style, high-performance deep learning library. Advances in neural information processing systems. 2019;32:8026–8037.
  • 48. Huang, G. et al. Snapshot ensembles: Train 1, get m for free. arXiv preprint arXiv:1704.00109 (2017).
  • 49. Nanjing Normal University Lab of Smart City Sensing and Simulation. 2021. Vectorized rooftop area data for 90 cities in China. National Tibetan Plateau Data Center.
  • 50. Environmental Systems Research Institute. ESRI Shapefile Technical Description, an ESRI white paper. (1998).
  • 51. Saalfeld A. Topologically consistent line simplification with the Douglas-Peucker algorithm. Cartography and Geographic Information Science. 1999;26:7–18. doi: 10.1559/152304099782424901.
  • 52. Hay A. The derivation of global estimates from a confusion matrix. International Journal of Remote Sensing. 1988;9:1395–1398. doi: 10.1080/01431168808954945.
  • 53. Deng X, Liu Q, Deng Y, Mahadevan S. An improved method to construct basic probability assignment based on the confusion matrix for classification problem. Information Sciences. 2016;340:250–261. doi: 10.1016/j.ins.2016.01.033.


Articles from Scientific Data are provided here courtesy of Nature Publishing Group
