Abstract
The superpixel-based graph convolutional network with local and global information (SGCN-LG) is introduced in this paper for polarimetric synthetic aperture radar (PolSAR) image classification. The number of superpixels (SPs) is automatically determined by analyzing the second order differences of the standard deviations of the pixels within square patches of various sizes. The local graph is constructed from neighboring SPs, where the number of neighbors for each SP is automatically determined by considering multiple hyper windows around all composing pixels of that SP. Moreover, the nearest neighboring SPs from all classes are chosen from the entire scene to construct the global graph, which contains the discrimination information and the relationships among labeled SPs. The local and global features are fused to produce the classification map. According to the experimental results, the proposed SGCN-LG model outperforms several powerful PolSAR classification models.
Keywords: Graph convolutional network, Deep learning, Polarimetric SAR, Superpixel, Classification
Subject terms: Electrical and electronic engineering, Computational science
Introduction
Polarimetric synthetic aperture radar (PolSAR) images, with their all-time imaging capability and the scattering information they provide through the transmission and reception of electromagnetic waves with various polarizations in different directions1, are among the best image sources for remote sensing applications such as classification2-3. In the past decades, most studies have focused on the physical scattering mechanism, where various target decomposition methods such as Cloude–Pottier, Freeman, Pauli and Krogager4-5 have been used for scattering feature extraction. The extracted features can be classified by an appropriate classifier such as a Bayesian classifier6 or support vector machine (SVM)7. Moreover, some previous methods have focused on the PolSAR statistical distribution, where the Wishart distribution was introduced for PolSAR image classification8. However, due to the complex nature of the imaged scene, the nonlinear relationships in the input image, and the complexity of textural structures, more efficient polarimetric and spatial features are required for accurate PolSAR image classification9–11.
Sparse representation with dictionary learning combined with a nonlinear transformation is proposed in the nonlinear projection dictionary pair learning (NDPL) method for PolSAR image classification12. The composite kernel-hybrid discrimination random field (CK-HDRF) method utilizes the advantages of composite kernels in handling nonlinearity and high dimensionality, besides the discriminative random field for modelling the posterior distribution, to analyze complex texture13.
Recently, deep learning-based models have shown superior performance in various image processing applications such as PolSAR image classification14–16. In particular, the convolutional neural network (CNN) has achieved high success in PolSAR image analysis17-18. A PolSAR image has a 3D nature, with spatial information in the first two dimensions and polarimetric information in the third dimension. Therefore, a three-dimensional CNN capturing 3D patches as input can simultaneously extract scattering and spatial features19. The residual convolutional neural network with autoencoder based attention (RCNN-AA) is introduced in20. It benefits from the convolutional autoencoder (CAE) to attend to fine features in the PolSAR image. The scaled difference between the original input patch and its approximation obtained by the CAE is considered as the attention weight containing information about the fine spatial features. The attention feature maps, beside the original ones, are fed into the residual CNN. The discriminative features based high confidence classification (DFC) introduced in2 uses several approaches to improve PolSAR image classification. It selects pre-determined convolutional kernels from the important regions of the image without requiring learning, so it does not need a high volume of training samples. Through a multi-view analysis, diverse classification maps with different information are generated. Moreover, a feature space with reduced dimensionality, minimum overlap, and maximum class separability is provided by a two-step discriminant analysis method. Finally, the classification map is generated by high confidence decision fusion.
Convolutional architectures extract spatial features from neighborhood regions. However, they cannot effectively extract global information from the whole image. To solve this issue, transformers and attention-based modules have been introduced, which globally extract long-range dependencies and interactions21-22. Transformers were initially introduced for natural language processing. Thereafter, the vision transformer (ViT) was introduced to capture global information in the image domain by using the self-attention mechanism23. A ViT based PolSAR classifier is introduced in24. Due to the presence of objects with different sizes and shapes in natural scenes, there are heterogeneous regions in PolSAR images with various contextual information, which can be represented at different levels and scales. To explore this rich source of information, the multi-scale and multi-level attention learning (MMAL) network is introduced in25. It utilizes the cross-attention mechanism to explore the relationships between low-level and high-level features and between medium-level and high-level features at multiple scales.
Using a high number of polarimetric features as the input of deep learning models such as CNNs can provide various scattering information. However, a high dimensional feature cube is not very effective, especially when only a small training set is available. To solve this issue, an attention based polarimetric feature selection (AFS) convolutional network, called AFS-CNN, is proposed in26, which performs feature selection and classification in an end-to-end framework. In addition to the CNN and its variants, other forms of deep learning models such as recurrent neural networks have been tried for PolSAR image analysis. For example, in27, the neighborhood regions are converted to spatial sequences. Then, multi-scale spatial features are explored by applying an attention-based multi-scale spatial enhanced long short-term memory (AMSE-LSTM) network. To extract the pixel-based scattering relationships in a PolSAR image, a graph-based complex-valued 3DCNN is used besides a random field with high order cliques in the deep features based high order triple discrimination random field (DF-HoTDF) model28.
Segmentation based PolSAR analysis can improve classification performance. Considering superpixels (SPs) instead of disjoint pixels not only reduces noise and explores contextual information but also reduces computation. The simple linear iterative clustering (SLIC) method is a simple and efficient segmentation algorithm29. More advanced SP generation methods have been introduced for PolSAR image segmentation to preserve details in heterogeneous regions and produce smooth representations in homogeneous regions. An improved version of SLIC is proposed for PolSAR images in30, which adapts to the polarimetric characteristics and statistical measures of PolSAR. Moreover, it uses polarimetric feature similarities as statistical distances in its clustering function. The revised Wishart distance is integrated with the geodesic distance for PolSAR clustering through a cross-iteration strategy in31. In32, a fuzzy SP algorithm is introduced, which uses the correlation between scattering information to cluster pixels. A hierarchical energy driven method is introduced in33 for PolSAR image segmentation. At the coarse level, it uses the histogram intersections of the coherency matrix for SP generation, and at the fine level, it uses the Wishart energy for SP evaluation. In34, the relationship between the initial SP size and the structural complexity of PolSAR is established beside the determinant ratio test, which leads to a reliable SP generation method with adaptive size estimation.
A composite kernel-based elastic net classifier based on SPs is introduced for PolSAR image classification in35. First, three types of features are extracted using SP segmentation at different scales. Then, these features are mapped by constructing a composite kernel exploiting the correlation and diversity between the different features. Finally, the elastic net classifier is integrated with the composite kernel for PolSAR image classification using limited training samples. Advanced methods have recently been suggested for SAR image segmentation. For example, in36, both the SP generation and merging steps are incorporated into a unified deep network. First, a differentiable SP generation method is employed for oversegmentation of the single-polarization SAR image. Its output is the likelihood of pixels belonging to different SPs. Then, in the merging part, the soft SP set is converted into a self-connected weighted graph. As an advantage, the shapes of SPs are iteratively adjusted according to the boundaries during training. SPs are introduced into hypothesis test theory in37 for PolSAR change detection and built-up area extraction. To this end, the PolSAR image is first oversegmented into a set of SPs, and the probability density function of a SP's reflectivity is derived. Then, a superpixelwise likelihood-ratio test statistic is presented to measure the similarity of the covariance matrices of two superpixels for unsupervised change detection.
To deal with the small sample size problem in CNNs, a dual branch CNN is introduced in38, which uses a SP algorithm to expand the number of labeled samples. The first branch of the CNN extracts polarization features and the second branch extracts spatial features. Moreover, an ensemble learning algorithm is used with the dual branch CNN to improve the classification results. Due to high level feature extraction, deep learning methods may cause edge confusion. To handle this issue, in39, a double channel CNN with an edge preserving Markov random field is proposed. One subnetwork uses the Wishart based complex matrix to learn the statistical characteristics, and another subnetwork learns high level semantic features. Although the Vision Transformer (ViT) has shown great performance for PolSAR image classification, it requires a large number of labeled samples for training and encounters semantic misalignment due to fixed patch tokenization. To address these issues, a SP content-aware and semi-supervised ViT network is suggested in40. To generate the token sequences, SPs with random sizes are divided into blocks and masked randomly. To implement a semi-supervised ViT, both supervised and unsupervised learning are integrated.
Graph neural networks have been used in a limited number of works for PolSAR image classification. In41, a graph convolutional network is used for neural architecture search. To this end, a graph is constructed whose nodes are the pixels of the PolSAR image. It introduces a search space whose components come from several graph neural networks. To deal with the small sample size situation in PolSAR image classification, a graph-based semisupervised deep learning method is proposed in42. The PolSAR image is modeled as an undirected graph in which labeled and unlabeled pixels are the nodes, and the weighted edges show the similarities between pixels. A CNN model is used for polarimetric feature extraction and outputs the class labels to the graph model.
In CNNs and many other deep learning models, the input of the model consists of fixed size patches where the label is assigned to the central pixel. Considering the relationships among adjacent pixels in neighborhood regions and taking SPs as the input of the model may improve the classification map by involving local information and reducing noisy pixels. On the other hand, graph-based networks, by aggregating node features, can provide an improved feature representation. The graph convolutional network (GCN)43-44, by utilizing nonlocal features and modelling data structures, enhances the feature representation. To benefit from the advantages of both SP based and graph based analysis, several works have integrated these approaches. A SP-wise segmentation network for single-polarization SAR images is introduced in45. It first uses a differentiable boundary-aware clustering method to estimate task-specific SPs using a simple fully convolutional network. Then, a soft graph convolution network takes the association map and produces the SP-wise segmentation. As an advantage, both the SP generation and graph convolution parts are trained under a unified framework, and the shapes of SPs are adjusted according to the segmentation results, adhering to the boundaries.
The feature enhanced SP hypergraph neural network (FESHNN) is introduced in46 for PolSAR image classification, which benefits from the advantages of SP-based graph models for the extraction of polarimetric and spatial correlations. Its feature discrimination is enhanced by refining the local features contained in pixels and SPs. However, this method ignores the global information contained in the class labels across the scene. The efficiency of the SP-based GCN method depends substantially on the SP segmentation result, which is affected by speckle noise and scattering confusion. To deal with this difficulty, a hybrid weighted fuzzy SP-based GCN method is introduced in47, which corrects the edge pixels by defining a fuzzy projection matrix. The features are transformed from the SP level to the pixel level, where the features of edge pixels are computed from all neighboring SPs to refine the edges toward the most similar region. Both the multifeature distances and the revised Wishart distance are used to define the hybrid weighted adjacency matrix. This method disregards the local individual features of pixels and captures the global contextual information. To combine local and global features, the graph network is integrated with a 3DCNN into a unified framework.
As mentioned, most existing PolSAR classification methods take fixed patches as input and utilize CNNs for feature extraction and classification, where only local information is exploited. The polarimetric information in PolSAR images is complex, and using only local information may not be sufficient to provide an accurate classification map. In contrast to pixel-based methods, approaches such as graph neural networks, which take irregular SPs as input and update features through the graph structure based on the information of adjacent nodes, learn global information beside the local one, and so improve PolSAR image classification. However, the scale of SPs affects the classification results due to the existence of objects with various shapes and sizes. To handle this issue, a multiscale SP guided weighted graph convolutional network is proposed in48. First, it segments the PolSAR image into SPs at three different scales. Then, the correlation among SPs is used to form the adjacency matrix, and the weighted graph convolutional network is utilized to provide the SP feature representation. Finally, a multiscale feature cascade fusion module is introduced to provide the pixel level features.
For several reasons, the SP-based analysis of a PolSAR image can be preferred to pixel-based analysis: (1) due to the noisy nature of SAR images, the SP representation of a PolSAR image is more appropriate than its pixel representation because SPs suppress noisy pixels; (2) the SP representation implicitly explores the spatial information of the PolSAR image; and (3) the use of SPs instead of pixels reduces the computational burden. Usually, the appropriate number of SPs is set manually. In49, a formula is presented for computing the SP segmentation scale for a hyperspectral image, where inherent properties of hyperspectral images such as spatial size, texture ratio, spatial resolution, and the number of categories are taken into account to compute the number of SPs. However, selecting an appropriate number of SPs in a PolSAR image requires trial and error, which is a troublesome task. On the other hand, limited works have studied the ability of graph convolutional networks for PolSAR image classification. Because of the ability of graphs to explore the hidden relationships among the defined nodes, they are great tools for feature extraction in complex feature spaces. Moreover, by providing a holistic view of all nodes, graphs can explore global information.
Although different works have shown great success in providing accurate classification maps by utilizing the advantages of SPs and graphs, they mostly utilize the SP based graph structure only for global feature extraction. Some works have integrated graph networks with CNNs to combine global and local features. However, exploring the irregular local structure is not possible with a convolutional network. To address this issue, a SP based network is designed in this work, which utilizes graph networks for exploring both local and global features. A local graph is composed from neighboring SPs, and a global graph is composed from the nearest SPs of the different classes across the PolSAR image. The local and global graphs are individually analyzed using graph convolutional networks, and their extracted features are then fused and used for PolSAR image classification. Moreover, most SP based networks suffer from the difficulty of determining the number of SPs through trial and error. This issue is also addressed in this work by introducing an automatic method for determining the number of SPs. In addition, most graph-based networks have a high number of learnable parameters and are highly complicated. A simple dual graph convolutional network with a low number of learnable parameters is proposed in this work, which is simple to implement and runs fast in the prediction (test) phase.
To improve PolSAR image classification, a SP-based graph convolutional network (SGCN) is introduced here, which consists of two branches. While the first branch extracts local spatial features through a local graph constructed from unlabeled neighboring SPs, the second branch extracts global information through a global graph constructed from the nearest labeled SPs of all classes. The main contributions of this work are as follows:
A graph convolutional network with two branches containing the local and global information is constructed for PolSAR image classification.
The local graph provides the local neighborhood information as well as the structure of the unlabeled samples.
The global graph contains the global class information and the structure of the labeled samples.
The number of SPs is determined automatically by computing the standard deviation vector and its second-order difference vector.
The number of neighboring SPs is automatically determined by considering multiple local windows around all composing pixels of each SP.
The SGCN model is assessed through an ablation study where only the local branch is used (SGCN-L), only the global branch is used (SGCN-G), and both the local and global branches are fused together (SGCN-LG). The experimental results show the superior performance of the proposed SGCN-LG model on different PolSAR images. Comparison with several state-of-the-art methods shows that SGCN-LG outperforms its competitors while using a lower number of training samples.
Proposed SGCN-LG model
In this work, the superpixel based graph convolutional network (SGCN) with local and global feature fusion, called SGCN-LG, is proposed for PolSAR image classification. The proposed SGCN model consists of two branches, a local graph and a global graph, where each graph is composed of superpixels (SPs) as the graph nodes. The local and global features, containing the data structures of the spatial neighbors and the class neighbors, respectively, are eventually fused to classify the input SP. The flowchart of the SGCN-LG framework and the proposed network are shown in Figs. 1 and 2, respectively.
Fig. 1.
Flowchart of the proposed SGCN-LG framework.
Fig. 2.
The proposed network in the SGCN-LG model.
As mentioned, SGCN-LG consists of a local part (SGCN-L) and a global part (SGCN-G), which are explained in more detail in the following. Before that, the generation of SPs with automatic determination of their number is described.
Superpixel generation with automatic determination of the number of superpixels
The proposed graph model uses the SPs of the PolSAR image as nodes. The SLIC algorithm is used here for the generation of SPs because it is a well-known method with a simple implementation. The SLIC algorithm is applied to the first principal component (PC1) obtained by the principal component analysis (PCA) transform50, which is normalized by:
$$\overline{\mathrm{PC1}} = \frac{\mathrm{PC1}-\min(\mathrm{PC1})}{\max(\mathrm{PC1})-\min(\mathrm{PC1})} \tag{1}$$

where min(·) and max(·) compute the minimum and maximum values among all pixels of the PC1 image. The SLIC algorithm is applied to PC1, which contains the polarimetric components with the most energy; hence, it takes the polarimetric information of the PolSAR image into account.

In this algorithm, the number of SPs, denoted as $S$, is given as an input, which is a user-defined parameter. Although an appropriate value of $S$ can be determined for each dataset by experiments, a simple method is proposed in this work for the automatic determination of the number of SPs, which provides appropriate results for various PolSAR images.
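As a concrete illustration of the input prepared for SLIC, the following NumPy sketch computes PC1 of a real-valued feature cube and min-max normalizes it. The function name and the synthetic cube are illustrative, not from the paper.

```python
import numpy as np

def pc1_normalized(img):
    """Compute the first principal component of a (H, W, C) feature cube
    and min-max normalize it as in Eq. (1); SLIC would then be applied
    to the returned image.  A sketch with illustrative names."""
    h, w, c = img.shape
    flat = img.reshape(-1, c).astype(float)
    flat -= flat.mean(axis=0)                  # center the features
    cov = np.cov(flat, rowvar=False)           # channel covariance
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    pc1 = (flat @ eigvecs[:, -1]).reshape(h, w)  # largest-eigenvalue component
    return (pc1 - pc1.min()) / (pc1.max() - pc1.min())  # Eq. (1)

rng = np.random.default_rng(0)
cube = rng.normal(size=(32, 32, 9))            # synthetic 9-channel image
pc1n = pc1_normalized(cube)
```

The normalized output lies in [0, 1] by construction, which keeps the SLIC distance term well scaled.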
The number of SPs in an image can be approximated by the number of square patches that cover it. With $S$ patches of size $w \times w$, each patch contains $w^2$ pixels. Let $N$ be the total number of pixels in the image, so that $S = \mathrm{round}(N/w^2)$, where round(·) rounds the input argument. To find the appropriate patch size $w$, 25 odd numbers are considered as the patch size candidates in a vector $s = (3\!:\!2\!:\!51)^T$, where $3\!:\!2\!:\!51$ denotes the range from 3 to 51 with step 2, and $(\cdot)^T$ is the transpose operation. For each candidate patch size $s_j$, the PolSAR image in each polarimetric channel is divided into $M_j$ patches of size $s_j \times s_j$, where $C$ denotes the number of polarimetric channels. The standard deviation (std) of the pixels in each patch is computed, and the average std over all patches in all polarimetric channels is computed for each patch size as follows:

$$\sigma_j = \frac{1}{C M_j} \sum_{c=1}^{C} \sum_{m=1}^{M_j} \mathrm{std}\big(P^{j}_{m,c}\big) \tag{2}$$

where $P^{j}_{m,c}$ is the $m$th patch generated in the $c$th channel for patch size $s_j$, and std(·) computes the standard deviation of the pixels within the input patch. Computing the std value for all assumed patch sizes yields the vector $\boldsymbol{\sigma} = (\sigma_1, \ldots, \sigma_{25})^T$. The difference operation is then applied to $\boldsymbol{\sigma}$. For a vector $v$ with length $n$, the difference operation calculates the differences between adjacent elements of $v$ as follows:

$$\mathrm{diff}(v) = \left(v_2 - v_1,\; v_3 - v_2,\; \ldots,\; v_n - v_{n-1}\right) \tag{3}$$

The first order differences of $\boldsymbol{\sigma}$ are computed as:

$$d_1 = \mathrm{diff}(\boldsymbol{\sigma}) \tag{4}$$

and applying the difference operation to $d_1$ gives the second order differences:

$$d_2 = \mathrm{diff}(d_1) \tag{5}$$

The output dimension of the difference operation equals the dimension of the input vector minus one. So, because the dimension of $\boldsymbol{\sigma}$ is 25, the dimensions of $d_1$ and $d_2$ are 24 and 23, respectively. The vector $\boldsymbol{\sigma}$ and its first and second order differences are plotted versus the patch size for the Sanfrancisco image in Fig. 3. Because the length of $d_2$ is two units less than the length of $\boldsymbol{\sigma}$, the x-axis for the plot of $\boldsymbol{\sigma}$ in Fig. 3(a) starts from 3, while the x-axes for $d_1$ and $d_2$ start from 5 and 7, respectively, in Fig. 3(b) and Fig. 3(c).
Fig. 3.
The (a) $\boldsymbol{\sigma}$ values, (b) the first order differences, and (c) the second order differences versus the patch size.
As seen, with increasing patch size, the std value generally increases, $d_1$ decreases, and the associated $d_2$ takes negative values. This is expected because, with increasing patch size, more pixels differing from the central pixel may be located in the patch, which increases the variance. Moreover, with increasing patch size, the changes of the std values, which correspond to the differences $d_1$, decrease up to the point where the patch still contains related and similar pixels. In other words, once the patch size grows beyond this point, the variations of the changes start to increase, which shows that the patch may contain pixels unrelated to the central one. Therefore, the first place where $d_1$ starts to increase, or equivalently where $d_2$ becomes positive, can indicate the appropriate patch size:

$$q = \mathrm{find}(d_2 > 0) \tag{6}$$

where the vector $q$ holds the indices associated with the positive values of $d_2$. The first index of $q$, i.e., $q(1)$, corresponds to the first place where $d_2$ is positive. Because $d_2$ is two elements shorter than $\boldsymbol{\sigma}$, the appropriate patch size corresponds to index $q(1)+2$ in the patch size vector $s$, i.e., $s(q(1)+2)$. However, considering the first index, $q(1)$, leads to the selection of a relatively small patch size, and so a high number of SPs, which not only increases the graph computations but also does not provide highly accurate results. Instead of $q(1)$, the use of $q(3)$ is therefore suggested for determining the appropriate patch size. In other words, instead of the first place where $d_2$ becomes positive, the third place where $d_2$ becomes positive, i.e., $q(3)$, is considered, which leads to the selection of a larger patch size, and so, a lower number of SPs. For example, in Fig. 3, the third place where $d_2$ takes a positive value in Fig. 3(c), associated with the third place where $d_1$ starts to increase in Fig. 3(b), corresponds to the selected patch size in Fig. 3(a).
The number of pixels in each SP is approximated by $s(q(3)+2)^2$, where $s(q(3)+2)$ is the appropriate patch size selected from the patch size vector $s$, the indices $q$ are determined by (6), and $q(3)$ is the third element of the vector $q$. The number of SPs, $S = \mathrm{round}\big(N / s(q(3)+2)^2\big)$, is finally obtained and used as the input of the SLIC segmentation algorithm.
To process an image with homogeneous regions, the image should be partitioned into a smaller number of SPs, where each SP covers a relatively large homogeneous region with a high number of similar pixels. In contrast, to process an image with heterogeneous regions, the image should be partitioned into a larger number of SPs with small areas, where each SP covers a small region containing a low number of similar pixels.
In images with homogeneous regions, the adjacent pixels within a patch have a smaller variance. So, $d_1$ starts to increase later, and therefore, $d_2$ becomes positive later. In other words, a larger patch size is selected from the patch size vector $s$, which is associated with a larger number of pixels in each SP and a lower number of SPs $S$. In contrast, in images with heterogeneous regions, there are high variations in the image. So, $d_1$ starts to increase earlier, and therefore, $d_2$ becomes positive earlier. Thus, a smaller patch size is selected, which is associated with fewer pixels per SP and a larger $S$.
According to the above method, the numbers of SPs are obtained for the three datasets, which will be introduced in Sect. 3: the Flevoland, Sanfrancisco, and Oberpfaffenhofen images. The generated SP maps for the three datasets are shown in Fig. 4.
Fig. 4.
SP maps of Flevoland, Sanfrancisco and Oberpfaffenhofen images.
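The selection procedure of Eqs. (2)-(6) can be sketched in NumPy as follows. This is a simplified reading under two stated assumptions: patches are taken non-overlapping, and the guard used when fewer than three positive entries of $d_2$ exist is ours, not from the paper.

```python
import numpy as np

def auto_superpixel_count(img, sizes=range(3, 52, 2)):
    """Sketch of the automatic SP-number selection: average patch std per
    candidate size (Eq. 2), first/second differences (Eqs. 4-5), and the
    third positive entry of d2 picks the patch size (Eq. 6)."""
    h, w, c = img.shape
    sizes = list(sizes)
    sigma = []
    for p in sizes:
        stds = []
        for ch in range(c):
            for i in range(0, h - p + 1, p):       # non-overlapping patches
                for j in range(0, w - p + 1, p):
                    stds.append(img[i:i+p, j:j+p, ch].std())
        sigma.append(np.mean(stds))
    d1 = np.diff(sigma)                            # Eq. (4)
    d2 = np.diff(d1)                               # Eq. (5)
    q = np.flatnonzero(d2 > 0)                     # Eq. (6)
    if len(q) == 0:                                # guard: our own fallback
        idx = len(d2) - 1
    elif len(q) >= 3:
        idx = q[2]                                 # third positive place, q(3)
    else:
        idx = q[-1]
    best = sizes[idx + 2]                          # +2: d2 is two diffs shorter
    return max(1, int(round(h * w / best ** 2)))   # S = round(N / w*^2)

rng = np.random.default_rng(1)
S = auto_superpixel_count(rng.normal(size=(120, 120, 1)))
```

The returned $S$ would then be passed to SLIC as its target number of segments.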
Graph construction
An undirected graph $G = (V, E)$ is constructed, with the vertex set $V$ containing $n$ nodes and the edge set $E$, where $e_{ij}$ represents the edge between nodes $v_i$ and $v_j$, i.e., the edge $(v_i, v_j)$. The adjacency matrix $A \in \mathbb{R}^{n \times n}$ is composed from the edge set $E$ as follows:

$$A_{ij} = \begin{cases} \exp\!\left(-\dfrac{\lVert x_i - x_j \rVert_2}{\max\limits_{(v_p, v_q) \in E} \lVert x_p - x_q \rVert_2}\right), & (v_i, v_j) \in E \\ 0, & \text{otherwise} \end{cases} \tag{7}$$

where max(·), as the maximum operator, finds the maximum value among the input elements, and $x_i \in \mathbb{R}^{F}$ is the feature vector associated with node $v_i$, with $F$ being the dimensionality of the feature vector. The diagonal degree matrix $D$ is constructed as $D_{ii} = \sum_{j} A_{ij}$. With $I_n$ as the identity matrix of dimensions $n \times n$, $\tilde{A} = A + I_n$ is the adjacency matrix with added self-connections and $\tilde{D}$ is the degree matrix of $\tilde{A}$. The normalized adjacency matrix is $\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$. Considering the architecture introduced for the graph convolution operation51-52, we have:
$$H^{(l+1)} = \varphi\!\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right) \tag{8}$$

where $\varphi$ is the activation function, which is the rectified linear unit (ReLU) here. $H^{(l)}$ denotes the feature matrix obtained after $l$ layers, and $H^{(0)} = X$, where $X$ is the input feature matrix. $W^{(l)}$ is the trainable weight matrix for multiplication in layer $l$. To implement the trainable weight matrix multiplication, an elementwise product followed by a 2D convolutional operator with one filter is applied in this work (see Fig. 2).
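A single propagation step of Eq. (8) can be sketched in NumPy as below. Note this uses a plain dense weight matrix for $W^{(l)}$ rather than the paper's elementwise-product-plus-convolution implementation, and the toy graph and weights are illustrative.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer following Eq. (8):
    ReLU(D~^{-1/2} (A + I) D~^{-1/2} H W).  Sketch with a dense W."""
    n = A.shape[0]
    A_hat = A + np.eye(n)                      # add self-connections
    d = A_hat.sum(axis=1)                      # degrees of A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # normalized adjacency
    return np.maximum(0.0, A_norm @ H @ W)     # ReLU activation

A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])                   # 3-node path graph
H0 = np.eye(3)                                 # one-hot node features
W0 = np.full((3, 2), 0.5)                      # toy trainable weights
H1 = gcn_layer(A, H0, W0)                      # shape (3, 2)
```

Stacking two such calls gives the two-layer propagation used by typical GCN classifiers.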
SGCN-L
In the SGCN-L model, a graph is constructed on a given SP and its
spatial neighbors. After segmentation of the PolSAR image in previous section, a local graph is constructed from each SP and its adjacent SPs. Assume that a superpixel
contains
pixels. Around each pixel of the SP, a
local window is considered that
![]() |
9 |
where the appropriate size of
can be obtained through experiments (it is discussed in Sect. 3.2). The value
is approximation of the number of pixels in a SP. For considering a super window containing
SPs, which consists of about
pixels, the length of the square window will be about the square root of
. For each pixel of the SP, a
window is constituted and label (number) of the SP that each pixel of this window belongs to it is saved. So, for superpixel
that has
pixels, the central pixel 1 and its neighbors in its
local window belongs to
SPs. Similarly, the neighbors of pixel 2 in its
neighborhood window belongs to
SPs, and eventually, the neighbors of pixel
in its
neighborhood window belongs to
SPs. Because the adjacent pixels may belong to the same SP, the unique SPs is countered. So,
are the number of unique SPs in pixels 1, 2,…,
, respectively. The minimum number among
is obtained by:
![]() |
10 |
So,
neighbors is determined for superpixel
. This process is repeated for all
SPs in the image and
neighbors is selected for
th SP. Minimum of the obtained numbers is finally considered as
:
![]() |
11 |
Eventually, for each SP,
SPs that are located in local window of its composing pixels are selected as
local neighboring SPs. After here to next, for simplicity in notations,
is written instead of
. According to what explained, the number of neighboring SPs for three datasets are obtained as follows:
in Flevoland,
in Sanfrancisco, and
in Oberpfaffenhofen image.
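The counting of Eqs. (10)-(11) can be sketched directly on a superpixel label map. Two stated assumptions: the window is truncated at image borders, and the unique count includes the SP the pixel itself belongs to.

```python
import numpy as np

def neighbor_count(labels, window):
    """Sketch of Eqs. (10)-(11): for every superpixel, count the unique
    SP labels seen in an l-by-l window around each of its pixels, take the
    per-SP minimum (Eq. 10), then the global minimum over SPs (Eq. 11)."""
    h, w = labels.shape
    r = window // 2
    per_sp_min = {}
    for i in range(h):
        for j in range(w):
            sp = labels[i, j]
            patch = labels[max(0, i-r):i+r+1, max(0, j-r):j+r+1]
            n_unique = len(np.unique(patch))   # unique SPs in this window
            per_sp_min[sp] = min(per_sp_min.get(sp, n_unique), n_unique)
    return min(per_sp_min.values())            # common neighborhood size k

# toy 4x4 label map with four 2x2 superpixels
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 3, 3],
                   [2, 2, 3, 3]])
k = neighbor_count(labels, 3)
```

On this toy map, each SP has a corner pixel whose 3-by-3 window sees only that SP, so the global minimum is 1.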
Now, for each SP, a local graph with $k$ nodes is made, where the nodes are the adjacent SPs that neighbor the given SP. The feature matrix for each SP is:

$$X_L = \left[\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k\right]^T \in \mathbb{R}^{k \times C} \tag{12}$$

where $\bar{x}_j$ is the mean of the pixels belonging to superpixel $SP_j$, and $C$ is the number of polarimetric channels. For each pixel of the PolSAR image, 9 elements of the coherency matrix $T$ are used as the feature vector as follows:

$$x = \left[T_{11}, T_{22}, T_{33}, \mathrm{Re}(T_{12}), \mathrm{Im}(T_{12}), \mathrm{Re}(T_{13}), \mathrm{Im}(T_{13}), \mathrm{Re}(T_{23}), \mathrm{Im}(T_{23})\right]^T \tag{13}$$

So, we have $C = 9$. Although many target decomposition methods can explore polarimetric scattering features, they are not used here for two reasons: (1) for simplicity and to avoid extra computations; and (2) because the PC1 of the features is used for the generation of SPs, the polarimetric features of the coherency matrix may be sufficient.
The local adjacency matrix is denoted by $A_L$. For each SP of the PolSAR image, the local feature matrix $X_L$ and the local adjacency matrix $A_L$ are used to compose the local graph convolutional model as described in the "Graph construction" section. The local graph has two inputs: $X_L$ and $A_L$. Because the dimensionality of the inputs must be fixed in the network, a fixed neighborhood size $k$ has to be considered for all SPs.
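Building the 9-element real feature vector of Eq. (13) from a Hermitian coherency matrix can be sketched as below; the ordering of the real/imaginary off-diagonal terms is one common convention and an assumption here.

```python
import numpy as np

def coherency_features(T):
    """Real 9-element feature vector from a 3x3 complex coherency matrix T,
    following Eq. (13): diagonal powers plus Re/Im of the upper triangle."""
    return np.array([T[0, 0].real, T[1, 1].real, T[2, 2].real,
                     T[0, 1].real, T[0, 1].imag,
                     T[0, 2].real, T[0, 2].imag,
                     T[1, 2].real, T[1, 2].imag])

# a Hermitian toy coherency matrix
T = np.array([[2.0 + 0.0j, 0.5 + 0.1j, 0.2 - 0.3j],
              [0.5 - 0.1j, 1.0 + 0.0j, 0.1 + 0.2j],
              [0.2 + 0.3j, 0.1 - 0.2j, 0.5 + 0.0j]])
x = coherency_features(T)                      # length C = 9
```

Averaging these vectors over the pixels of a superpixel would give the $\bar{x}_j$ rows of $X_L$ in Eq. (12).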
SGCN-G
Assume that there are labeled samples from
classes of dataset. For each labeled pixel, the SP that belongs to it is considered as the labeled SP (training sample) where label of the given pixel is assigned to the SP. For each SP, the nearest SP from each given class is selected.
nearest neighbors from
classes are used as the nodes to form the global graph for the given SP. The mean of pixels in each SP is used as the representative feature vector of that SP and the Euclidean distance is considered for computing the nearest SPs.
Because the training samples are globally located in entire the scene, the composed graph contains the global information with features from labeled samples of all classes. For each SP of the PolSAR image, the global feature matrix
and the global adjacency matrix
are used to compose the global graph convolutional model according to what described in the “2.2. graph construction” section.
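The node selection for the global graph can be sketched as follows; the function and variable names are illustrative, and the toy 1-D features stand in for the SP mean coherency vectors.

```python
import numpy as np

def global_neighbors(feat, labeled_feats, labeled_classes):
    """Sketch of the global-graph node selection: for a given SP feature
    vector, pick the Euclidean-nearest labeled SP from each class."""
    chosen = []
    for cls in sorted(set(labeled_classes)):
        idx = [i for i, c in enumerate(labeled_classes) if c == cls]
        dists = [np.linalg.norm(feat - labeled_feats[i]) for i in idx]
        chosen.append(idx[int(np.argmin(dists))])  # nearest SP of this class
    return chosen                                  # L nodes, one per class

feats = np.array([[0.0], [1.0], [10.0], [11.0]])   # toy labeled SP features
classes = [0, 0, 1, 1]
nodes = global_neighbors(np.array([0.2]), feats, classes)
```

The selected indices form the $L$ nodes of the global graph; their feature vectors stack into $X_G$.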
Feature fusion and classification
Outputs of the two branches of the model are individually flattened and then fused through a concatenation layer. The aim of this work is to provide a simple, light network with a relatively low number of parameters that yields efficient classification results even in small-sample-size situations, so the local and global branches are fused simply by concatenating the extracted features. Finally, a fully connected (FC) layer with $C$ neurons, a softmax layer, and a classification layer are used to find the label of the input SP as the model's output. For each SP, the local and global feature matrices and adjacency matrices are given as input, and the label of the given SP is obtained as output. The proposed model is thus a SP-based classification: the label of each SP is assigned to all pixels that compose it.
The classification map is filtered by the guided filter, with the first principal component as the guidance image, to provide a classification map aligned with the real class boundaries. The guided filter has two free parameters, which are set in the experiments: $r$, which determines the length of the filtering window, and $\epsilon$, the regularization parameter. For more details about the guided filter, the interested reader is referred to53.
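The fusion head can be sketched in a few lines; the weight matrix `W` and bias `b` stand in for the learned FC parameters and are assumptions of this sketch.

```python
import numpy as np

def fuse_and_classify(local_feats, global_feats, W, b):
    """Concatenate the flattened local- and global-branch outputs and
    apply one fully connected layer followed by softmax (C neurons,
    one per class)."""
    z = np.concatenate([local_feats.ravel(), global_feats.ravel()])
    logits = W @ z + b
    e = np.exp(logits - logits.max())   # numerically stable softmax
    probs = e / e.sum()
    return int(np.argmax(probs)), probs
```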
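The single-channel guided filter itself is straightforward to implement; the pure-NumPy sketch below follows the standard formulation of He et al.53. In the paper the guidance image is the first principal component and the filter is applied to the classification map; the per-dataset values of $r$ and $\epsilon$ are set experimentally and are not reproduced here.

```python
import numpy as np

def _box(x, r):
    """Mean over a (2r+1)x(2r+1) window (edge padding), via running sums."""
    k = 2 * r + 1
    p = np.pad(x, r, mode="edge")
    c = np.cumsum(p, axis=0)
    p = np.vstack([c[k - 1:k], c[k:] - c[:-k]])
    c = np.cumsum(p, axis=1)
    p = np.hstack([c[:, k - 1:k], c[:, k:] - c[:, :-k]])
    return p / (k * k)

def guided_filter(guide, src, r, eps):
    """Edge-preserving smoothing of `src`, steered by `guide` (2-D floats)."""
    mean_I, mean_p = _box(guide, r), _box(src, r)
    var_I = _box(guide * guide, r) - mean_I ** 2
    cov_Ip = _box(guide * src, r) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)      # eps regularises flat regions
    b = mean_p - a * mean_I
    return _box(a, r) * guide + _box(b, r)
```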
Experiments
Datasets and parameter settings
Three real L-band PolSAR images are used for the experiments. The first dataset is Flevoland, acquired by AIRSAR, which contains 15 classes and 750 × 1024 pixels. The second image, also acquired by AIRSAR, is the Sanfrancisco Bay image with 5 classes and 900 × 1024 pixels. The third dataset, acquired by the electronically steered array radar (ESAR), is Oberpfaffenhofen with 4 classes and 1297 × 935 pixels.
In the Flevoland image, 100 training samples (labeled pixels) per class are used, and in the two other datasets, 500 training samples per class. For the guided filter applied at the end of the proposed method, the window size $r$ and regularization parameter $\epsilon$ are set separately for the Flevoland, Sanfrancisco, and Oberpfaffenhofen datasets. For pixel-based classification methods, smaller window sizes in the guided filter are recommended; but because the proposed method is a superpixel-based classification, the obtained classification maps contain larger homogeneous regions, and so larger guided filters are applied to them. The Adam optimizer with an initial learning rate of 0.001, a batch size of 50, and 200 epochs is used for training the proposed methods.
The proposed framework is assessed in three cases: when only the local graph is used, i.e., SGCN-L; when only the global graph is used, i.e., SGCN-G; and when the features of both local and global graphs are fused, i.e., SGCN-LG. These proposed models are compared with SVM, a two-dimensional CNN (2DCNN), a three-dimensional CNN (3DCNN), and some state-of-the-art PolSAR classification methods. SVM is assessed in two different cases, where pixels or SPs are used as the classifier input, i.e., SVM (pixel) and SVM (superpixel). For the implementation of SVM, a polynomial kernel of degree 3 is used.
Due to the use of relatively small training sets, low-depth 2DCNN and 3DCNN networks are used as competitors. In the 2DCNN, two convolutional layers, each followed by batch normalization, ReLU, and dropout with dropping probability 0.2, are used; each convolutional layer uses 4 filters with stride 2 and "same" padding. In the 3DCNN, similar settings are used, with 3-D convolutional filters instead of 2-D ones. At the end of the 2DCNN and 3DCNN models, two fully connected layers are used, the second with $C$ neurons, followed by softmax and classification layers. The inputs of the 2DCNN and 3DCNN are image patches centered on each pixel.
Assessment of parameters’ effects
The effect of the number of superpixels on classification accuracy and prediction time for the Flevoland dataset is represented in Table 1. As said before, settings that determine a larger number of SPs cause a higher computational burden. The effect on the computational burden is larger for the global graph, i.e., SGCN-G, than for SGCN-L, because in SGCN-G, for each given SP, the nearest SPs from each class have to be found by computing Euclidean distances. As seen, the prediction time of SGCN-G (containing the global graph) and SGCN-LG (containing both local and global graphs) with 1229 SPs is about half of their prediction time with 3413 or 2657 SPs. From the classification-accuracy point of view, in the SGCN-L model, the accuracy obtained with 1229 SPs is significantly better than with 3413 or 2657 SPs. In the SGCN-G model, the accuracy decreases as the number of SPs decreases from 3413 to 1229: with fewer SPs generated in the PolSAR image, there may not be enough of them to accurately find the nearest neighbors from the different classes. In SGCN-LG, the accuracies obtained with 2657 and 1229 SPs, with little difference between them, are better than with 3413 SPs. Generally, for the main proposed method, i.e., SGCN-LG, 1229 SPs is the best choice in terms of both classification accuracy and prediction time.
Table 1.
Effect of the number of superpixels on classification accuracy and prediction time.

| No. of superpixels | Metric | SGCN-L | SGCN-G | SGCN-LG |
|---|---|---|---|---|
| 3413 | Overall accuracy | 90.82 | 96.08 | 98.44 |
| | Prediction time (seconds) | 1.25 | 1.10 | 1.53 |
| 2657 | Overall accuracy | 88.82 | 94.64 | 99.35 |
| | Prediction time (seconds) | 1.27 | 1.01 | 1.53 |
| 1229 | Overall accuracy | 95.64 | 92.15 | 99.26 |
| | Prediction time (seconds) | 1.24 | 0.55 | 0.73 |
The number of considered SPs should be large enough to fit the heterogeneous regions, such that non-similar pixels are assigned to different SPs. On the other hand, increasing the number of SPs increases the computational burden. According to Table 1, the classification accuracy and prediction time are obtained for the different models of the proposed framework with three different numbers of SPs in the Flevoland dataset, i.e., 3413, 2657, and 1229. The experiments show that 1229 is an appropriate number of SPs for the Flevoland dataset in terms of both classification accuracy and prediction time.
In Table 2, the effect of the neighborhood parameter is assessed for the Flevoland dataset. For each value of this parameter, the associated local window size $w$ computed according to (9) and the number of local neighboring SPs $K$ obtained by (11) are given. For each parameter value, the overall accuracy achieved by the local branch of the proposed model, i.e., SGCN-L, is represented, along with the running time needed to provide the neighbors for constructing the local graph. As seen from this table, a larger parameter value leads to a larger local neighborhood window $w$ and to the selection of more neighboring SPs $K$, which requires more running time. Increasing the local window size involves more spatial information from the local regions and improves the classification accuracy up to a point; beyond that point, the larger window includes redundant and unrelated spatial information, which may degrade the class-discrimination ability. As seen from this table, increasing the local window size up to $w = 77$ improves the classification accuracy, but after that the OA decreases; enlarging the window further not only decreases the accuracy but also increases the computation time. Accordingly, the parameter is set to 9 for all PolSAR images in this work. Generally, a larger local window requires more running time to find the neighboring SPs; however, these computations are done in the training phase and do not cause delay in the test (prediction) phase.
Table 2.
Effect of the neighborhood parameter on providing the neighboring superpixels.

| Parameter | Local window size $w$ | No. of neighboring SPs $K$ | Computation time (seconds) | OA of SGCN-L |
|---|---|---|---|---|
| 3 | 45.00 | 6 | 110.62 | 79.78 |
| 5 | 57.00 | 8 | 167.77 | 88.57 |
| 7 | 67.00 | 10 | 210.25 | 90.92 |
| 9 | 77.00 | 13 | 265.65 | 95.64 |
| 11 | 85.00 | 15 | 307.15 | 93.07 |
The whole network is trained in a unified, end-to-end supervised manner. In other words, the learnable parameters of both the local and global graphs are determined by supervised learning, and so the efficiency of both is affected by the number of training samples. The inputs of the local graph are $\mathbf{X}_L$ and $\mathbf{A}_L$ (of sizes governed by $K$ and $d$), and the inputs of the global graph are $\mathbf{X}_G$ and $\mathbf{A}_G$ (of sizes governed by $C$ and $d$), where $d = 9$ is the number of polarimetric features, $K$ is the number of local neighboring SPs (for example, in Flevoland, we obtain $K = 13$), and $C$ is the number of classes ($C = 15$ in Flevoland). Generally, the sizes of the input data, i.e., the feature matrices and adjacency matrices of the constructed local and global graphs, are not large. So, the proposed models SGCN-L, SGCN-G and SGCN-LG remain relatively efficient with limited training samples.
In Table 3, the overall accuracies obtained by the different cases of the proposed framework for the Flevoland dataset are reported for different numbers of training samples. As seen, when the number of training samples is low (10 or 50 per class), SGCN-G works better than SGCN-L. With 10 training samples per class, SGCN-G ranks first, with a significant margin over SGCN-L and SGCN-LG. The main proposed method, SGCN-LG, achieves high overall accuracy (OA) even with a low number of labeled samples (50 per class). Although the efficiency of SGCN-L and SGCN-G improves significantly as the number of labeled samples increases, SGCN-LG, which fuses the information from both, is less sensitive to the number of training samples: its difference between 100 and 150 training samples per class is not significant.
Table 3.
Overall accuracy obtained in different sizes of training set.
| No. of training samples per class | SGCN-L | SGCN-G | SGCN-LG |
|---|---|---|---|
| 10 | 48.86 | 78.51 | 67.15 |
| 50 | 68.49 | 86.09 | 95.97 |
| 100 | 95.64 | 92.15 | 99.26 |
| 150 | 98.68 | 94.82 | 99.29 |
Classification results
In Table 4, the classification results obtained for the Flevoland dataset are reported: the classification accuracy of each class, average accuracy (AA), overall accuracy (OA), and kappa coefficient (K). As seen, the proposed SGCN-LG model provides the highest AA, OA, K and Macro-F1 values. After that, SVM (superpixel) ranks second, and SGCN-L, 3DCNN, SGCN-G, 2DCNN, and SVM (pixel) occupy the next ranks, respectively. Most classes in the Flevoland image are agricultural regions with grained texture and varied polarimetric characteristics. So, in agricultural classes such as "Lucerne", "Beet", "Grass", "Rapeseed" and "Wheat 3", SGCN-L, which explores local information from the neighborhood context, performs better than SGCN-G, which focuses on global feature extraction. In contrast, in the "Buildings" class, which has a coarser texture, SGCN-G outperforms SGCN-L. Generally, in this dataset, SGCN-L results in a more accurate classification map than SGCN-G, which shows that the local features contained in neighboring SPs are more important than the global class information in this image. 100 samples per category (1500 in total) are used as the training set, which is relatively small. SVM, which has low sensitivity to the number of training samples, provides highly accurate results when implemented superpixel-based, where the use of SPs effectively improves its performance. Although a low depth is considered for the 2DCNN and 3DCNN architectures, it seems that the small training set used is still not enough for good learning of their models.
Table 4.
Classification results for the Flevoland dataset.
| Name of class | # samples | SGCN-L | SGCN-G | SGCN-LG | SVM (pixel) | SVM (superpixel) | 2DCNN | 3DCNN |
|---|---|---|---|---|---|---|---|---|
| Stembeans | 6103 | 99.57 | 99.25 | 99.41 | 96.00 | 98.31 | 97.20 | 97.64 |
| Peas | 9111 | 100.00 | 100.00 | 99.95 | 96.53 | 99.73 | 97.32 | 96.90 |
| Forest | 14,944 | 93.75 | 92.92 | 99.59 | 86.32 | 98.32 | 90.20 | 97.42 |
| Lucerne | 9477 | 97.30 | 82.57 | 94.68 | 91.09 | 97.38 | 95.42 | 95.28 |
| Wheat | 17,283 | 98.86 | 97.18 | 99.86 | 78.67 | 98.76 | 83.93 | 92.89 |
| Beet | 10,050 | 90.88 | 84.95 | 98.64 | 90.11 | 97.25 | 94.44 | 95.26 |
| Potatoes | 15,292 | 76.49 | 89.61 | 98.11 | 75.22 | 91.90 | 88.27 | 86.86 |
| Bare soil | 3078 | 100.00 | 100.00 | 100.00 | 99.71 | 100.00 | 100.00 | 100.00 |
| Grass | 6269 | 99.79 | 89.26 | 99.98 | 79.01 | 98.37 | 78.96 | 88.47 |
| Rapeseed | 12,690 | 95.55 | 84.04 | 99.76 | 80.64 | 98.12 | 85.58 | 92.41 |
| Barley | 7156 | 100.00 | 99.85 | 100.00 | 93.00 | 99.68 | 99.34 | 96.12 |
| Wheat 2 | 10,591 | 99.87 | 99.79 | 99.87 | 79.09 | 98.22 | 92.81 | 88.68 |
| Wheat 3 | 21,300 | 99.83 | 85.54 | 99.97 | 85.82 | 98.70 | 94.02 | 95.63 |
| Water | 13,476 | 98.26 | 98.66 | 99.90 | 96.96 | 99.13 | 98.81 | 97.27 |
| Buildings | 476 | 85.08 | 97.27 | 90.55 | 94.75 | 97.90 | 90.97 | 95.38 |
| AA | 95.68 | 93.39 | 98.68 | 88.19 | 98.12 | 92.49 | 94.42 | |
| OA | 95.64 | 92.15 | 99.26 | 86.10 | 97.89 | 91.81 | 93.99 | |
| K | 95.24 | 91.45 | 99.19 | 84.86 | 97.70 | 91.07 | 93.44 | |
| Macro-F1 | 95.47 | 88.62 | 97.43 | 85.85 | 97.97 | 91.79 | 93.97 | |
The proposed SGCN model, a graph constructed on the SPs, learns well especially when both local and global information are fused. To assess whether the differences among the classification methods are statistically significant, the Z scores are computed according to McNemar's test54, and the results are shown in Table 5. It is seen that SGCN-LG provides positive Z values much larger than 1.96 with respect to the other methods, which shows the superior performance of SGCN-LG from the statistical point of view. The Pauli RGB, ground truth map (GTM), and the classification maps of the Flevoland dataset are shown in Fig. 5.
Table 5.
McNemar's test results for the Flevoland dataset.
| SGCN-L | SGCN-G | SGCN-LG | SVM (pixel) | SVM (superpixel) | 2DCNN | 3DCNN | |
|---|---|---|---|---|---|---|---|
| SGCN-L | 0 | 42.84 | − 70.48 | 94.02 | − 39.13 | 45.14 | 21.29 |
| SGCN-G | − 42.84 | 0 | − 103.48 | 56.73 | − 76.39 | 3.65 | − 21.63 |
| SGCN-LG | 70.48 | 103.48 | 0 | 138.69 | 35.14 | 101.38 | 83.29 |
| SVM (pixel) | − 94.02 | − 56.73 | − 138.69 | 0 | − 122.74 | − 61.67 | − 88.71 |
| SVM (superpixel) | 39.13 | 76.39 | − 35.14 | 122.74 | 0 | 78.90 | 57.30 |
| 2DCNN | − 45.14 | − 3.65 | − 101.38 | 61.67 | − 78.90 | 0 | − 32.96 |
| 3DCNN | − 21.29 | 21.63 | − 83.29 | 88.71 | − 57.30 | 32.96 | 0 |
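Z scores of this kind can be computed directly from the disagreement counts of two classifiers on the test set, as in the minimal sketch below (the continuity-corrected variant of McNemar's statistic is omitted, assuming large counts).

```python
import numpy as np

def mcnemar_z(pred_a, pred_b, truth):
    """Z score of McNemar's test between two classifiers.
    f12: samples classifier A labels correctly and B labels wrongly;
    f21: the reverse.  |Z| > 1.96 indicates a significant accuracy
    difference at the 5% level; a positive Z favours classifier A."""
    a_ok = pred_a == truth
    b_ok = pred_b == truth
    f12 = np.sum(a_ok & ~b_ok)
    f21 = np.sum(~a_ok & b_ok)
    return (f12 - f21) / np.sqrt(f12 + f21)
```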
Fig. 5.
Classification maps for the Flevoland dataset.
As seen, SGCN-L, SGCN-G, SGCN-LG, and SVM (superpixel), which use SPs as input, provide cleaner classification maps than SVM (pixel), 2DCNN and 3DCNN. Although patches are used as input in 2DCNN and 3DCNN, a patch is representative only of its central pixel, to which the output label is assigned.
The classification accuracies and Z scores for the Sanfrancisco dataset are reported in Tables 6 and 7, respectively. The Sanfrancisco image has classes with large and approximately uniform texture, so the difference between SGCN-L and SGCN-G is not significant in most classes of this dataset. However, in the "Ocean" class, which is inherently
Table 6.
Classification results for the Sanfrancisco dataset.
| Name of class | # samples | SGCN-L | SGCN-G | SGCN-LG | SVM (pixel) | SVM (superpixel) | 2DCNN | 3DCNN |
|---|---|---|---|---|---|---|---|---|
| Bare soil | 15,628 | 66.80 | 67.19 | 59.85 | 39.67 | 70.62 | 83.89 | 85.95 |
| Mountain | 63,295 | 93.67 | 91.07 | 92.86 | 61.54 | 85.76 | 90.91 | 92.89 |
| Ocean | 328,118 | 82.36 | 97.69 | 98.29 | 81.26 | 61.95 | 95.90 | 95.45 |
| Urban | 343,465 | 94.17 | 96.18 | 98.68 | 29.37 | 84.86 | 89.90 | 87.45 |
| Vegetation | 54,758 | 79.06 | 83.15 | 87.89 | 33.84 | 45.92 | 82.36 | 83.79 |
| AA | 83.21 | 87.06 | 87.51 | 49.14 | 69.82 | 88.59 | 89.11 | |
| OA | 87.76 | 94.95 | 96.58 | 53.55 | 72.67 | 91.79 | 90.86 | |
| K | 81.71 | 92.12 | 94.63 | 38.89 | 61.84 | 87.54 | 86.24 | |
| Macro-F1 | 75.10 | 87.09 | 89.95 | 44.10 | 58.03 | 81.76 | 80.57 | |
Table 7.
McNemar's test results for the Sanfrancisco dataset.
| SGCN-L | SGCN-G | SGCN-LG | SVM (pixel) | SVM (superpixel) | 2DCNN | 3DCNN | |
|---|---|---|---|---|---|---|---|
| SGCN-L | 0 | − 190.34 | − 243.33 | 443.93 | 248.62 | − 90.72 | − 68.85 |
| SGCN-G | 190.34 | 0 | − 87.79 | 551.39 | 399.39 | 107.22 | 129.59 |
| SGCN-LG | 243.33 | 87.79 | 0 | 573.08 | 421.36 | 152.64 | 171.20 |
| SVM (pixel) | − 443.93 | − 551.39 | − 573.08 | 0 | − 237.37 | − 513.21 | − 502.50 |
| SVM (superpixel) | − 248.62 | − 399.39 | − 421.36 | 237.37 | 0 | − 332.58 | − 312.76 |
| 2DCNN | 90.72 | − 107.22 | − 152.64 | 513.21 | 332.58 | 0 | 48.29 |
| 3DCNN | 68.85 | − 129.59 | − 171.20 | 502.50 | 312.76 | − 48.29 | 0 |
a homogeneous region, SGCN-G yields a significantly better classification result than SGCN-L. In this dataset, SGCN-G works significantly better than SGCN-L according to AA, OA, K, Macro-F1 and the Z scores of McNemar's test. This shows that, in this image, the global features contained in the nearest labeled SPs of each class carry much more discriminative information than the unlabeled neighboring SPs in the local regions. The proposed SGCN-LG model, considering both the local features of unlabeled samples and the global features of labeled samples, provides the best classification results. SVM does not work well in either the pixel-based or the superpixel-based case. 2DCNN and 3DCNN achieve close results: the AA of 3DCNN is higher than that of 2DCNN, but the OA, K and Macro-F1 of 2DCNN are better. Note that, due to the higher number of learnable parameters in 3DCNN, 2DCNN may work better than 3DCNN when the training set is not large enough. In this dataset, SGCN-LG, SGCN-G and 2DCNN are the best candidates. The classification maps for the Sanfrancisco dataset are shown in Fig. 6. Although SVM (superpixel) uses SPs as input, it fails to align with the real class boundaries, and there are large false-alarm regions in its classification map. Moreover, in SGCN-L, the class shown in blue in the GTM (Bare soil) is wrongly assigned to a large area of the class shown in green (Ocean). SVM (pixel) fails to work. Although 2DCNN and 3DCNN provide relatively accurate results, their classification maps are very noisy.
Fig. 6.
Classification maps for the Sanfrancisco dataset.
The classification results and Z scores for the Oberpfaffenhofen dataset are represented in Tables 8 and 9, respectively. In homogeneous classes such as "Open areas" and "Wood land", SGCN-G outperforms SGCN-L, while in classes with more contextual detail such as "Built-up areas", SGCN-L works significantly better than SGCN-G. In this dataset, SGCN-G generally works better than SGCN-L, and SGCN-LG ranks first with a statistically significant margin over the other methods. SVM provides an OA of less than 50% in both the pixel-based and superpixel-based cases, which is not acceptable. 3DCNN and SGCN-G rank second and third, respectively. The classification maps are shown in Fig. 7. It can be seen that the class shown in yellow in the GTM (Built-up areas) is not well detected by SVM (pixel). There are many false-alarm regions in SVM (superpixel). 2DCNN and 3DCNN provide highly noisy classification maps. In SGCN-L, there are also many false-alarm regions. In contrast, SGCN-LG and SGCN-G provide more accurate and cleaner classification maps. Although SGCN-LG and SGCN-G provide higher classification accuracy than the other models, they are superpixel-based methods, which cannot preserve edges and class boundaries as well as pixel-level methods.
Table 8.
Classification results for the Oberpfaffenhofen dataset.
| Name of class | # Total samples | SGCN-L | SGCN-G | SGCN-LG | SVM (pixel) | SVM (superpixel) | 2DCNN | 3DCNN |
|---|---|---|---|---|---|---|---|---|
| Open areas | 625,029 | 48.70 | 68.95 | 70.48 | 32.78 | 65.14 | 72.92 | 73.20 |
| Wood land | 202,032 | 89.02 | 92.92 | 89.79 | 74.45 | 11.13 | 70.27 | 72.36 |
| Built-up areas | 190,202 | 67.30 | 33.08 | 71.79 | 14.49 | 27.10 | 38.10 | 43.45 |
| Road | 195,432 | 26.66 | 29.15 | 39.37 | 29.41 | 22.61 | 26.52 | 28.77 |
| AA | 57.92 | 56.02 | 67.85 | 37.78 | 31.49 | 51.95 | 54.44 | |
| OA | 54.78 | 60.90 | 68.89 | 36.31 | 43.32 | 59.54 | 61.23 | |
| K | 38.58 | 43.25 | 55.04 | 14.83 | 16.08 | 39.86 | 42.61 | |
| Macro-F1 | 50.81 | 52.13 | 63.94 | 33.13 | 31.22 | 50.73 | 52.89 | |
Table 9.
McNemar's test results for the Oberpfaffenhofen dataset.
| SGCN-L | SGCN-G | SGCN-LG | SVM (pixel) | SVM (superpixel) | 2DCNN | 3DCNN | |
|---|---|---|---|---|---|---|---|
| SGCN-L | 0 | − 116.47 | − 309.93 | 298.86 | 172.32 | − 81.38 | − 111.56 |
| SGCN-G | 116.47 | 0 | − 214.26 | 405.36 | 287.54 | 27.13 | − 6.66 |
| SGCN-LG | 309.93 | 214.26 | 0 | 508.32 | 400.36 | 180.26 | 150.89 |
| SVM (pixel) | − 298.86 | − 405.36 | − 508.32 | 0 | − 105.65 | − 399.15 | − 420.41 |
| SVM (superpixel) | − 172.32 | − 287.54 | − 400.36 | 105.65 | 0 | − 266.46 | − 293.75 |
| 2DCNN | 81.38 | − 27.13 | − 180.26 | 399.15 | 266.46 | 0 | − 44.73 |
| 3DCNN | 111.56 | 6.66 | − 150.89 | 420.41 | 293.75 | 44.73 | 0 |
Fig. 7.
Classification maps for the Oberpfaffenhofen dataset.
To assess the impact of guided filtering on the proposed network and its branches, the OA values obtained with and without guided filtering for all datasets are reported in Table 10. According to the obtained results, by removing noise while preserving the class boundaries according to the first principal component used as the guidance image, the guided filtering increases the OA in all cases. This improvement is significant for SGCN-L in the Sanfrancisco and Oberpfaffenhofen datasets.
Table 10.
The OA in both cases of without applying the guided filter and with applying the guided filter.
| Dataset | SGCN-L | SGCN-G | SGCN-LG | |||
|---|---|---|---|---|---|---|
| Without guided filtering | With guided filtering | Without guided filtering | With guided filtering | Without guided filtering | With guided filtering | |
| Flevoland | 93.21 | 95.64 | 90.36 | 92.15 | 98.15 | 99.26 |
| Sanfrancisco | 77.15 | 87.76 | 92.26 | 94.95 | 93.44 | 96.58 |
| Oberpfaffenhofen | 50.00 | 54.78 | 58.15 | 60.90 | 65.45 | 68.89 |
In this paper, the image is partitioned into SPs, and for each given SP two small graphs are constructed, which leads to low computation. The feature matrix and adjacency matrix of the local graph are $\mathbf{X}_L$ and $\mathbf{A}_L$, and those of the global graph are $\mathbf{X}_G$ and $\mathbf{A}_G$, where $d = 9$ is the number of polarimetric features, $K$ is the number of local neighboring SPs (for example, in Flevoland, we obtain $K = 13$), and $C$ is the number of classes ($C = 15$ in Flevoland). Generally, the sizes of the constructed graphs are relatively small, and so their processing does not require heavy computation.
In Table 11, the number of learnable parameters of each method is reported. As seen, all cases of the proposed framework, i.e., SGCN-L, SGCN-G and SGCN-LG, have a low number of learnable parameters, even smaller than the considered light 2DCNN. Each of SGCN-L and SGCN-G contains approximately half of the learnable parameters of SGCN-LG. In Table 12, the running times of the training and test (prediction) phases are reported for the different methods on the Flevoland dataset. Although the SGCN-L, SGCN-G and SGCN-LG models have high training times compared to the other methods, they run much faster than 2DCNN and 3DCNN in the test phase, and even faster than the pixel-based SVM. Although the superpixel-based SVM has the lowest running time, it is not as efficient across the various PolSAR images as the other methods.
Table 11.
The number of learnable parameters in each model.
| Method | SGCN-L | SGCN-G | SGCN-LG | 2DCNN | 3DCNN |
|---|---|---|---|---|---|
| No. of learnable parameters | 1.8k | 2k | 3.8k | 5.9k | 23.5k |
Table 12.
The running time in each model.
| Method | SGCN-L | SGCN-G | SGCN-LG | SVM (pixel) | SVM (superpixel) | 2DCNN | 3DCNN |
|---|---|---|---|---|---|---|---|
| Training time (seconds) | 1804.58 | 1753.27 | 3050.37 | 3.04 | 0.20 | 19.59 | 18.74 |
| Test time (seconds) | 1.24 | 0.55 | 0.73 | 3.60 | 0.02 | 72.62 | 74.25 |
Generally, the proposed framework, with its low number of learnable parameters, fast prediction, and high efficiency on different images, is a good candidate for PolSAR image classification.
Comparison with several state-of-the-art methods
The proposed SGCN-LG model is compared with several advanced methods in this section. The comparison results obtained for the Flevoland dataset using 100 training samples per class are reported in Table 13. It can be seen that the proposed SGCN-LG model provides the highest OA and kappa coefficient: the use of graph structures on SPs with fusion of the local and global views leads to the best performance. In terms of AA, DFC ranks first, and SGCN-LG ranks second by a small margin. Generally, after SGCN-LG, the best performance is obtained by DFC, which benefits from pre-determined convolutional kernels, multi-view analysis, two-step discriminant analysis,
Table 13.
Comparison with some state-of-the-art methods.
| Name of class | # samples | DFC | RCNN-AA | MMAL | SGCN-LG |
|---|---|---|---|---|---|
| Stembeans | 6103 | 99.56 | 99.08 | 99.23 | 99.41 |
| Peas | 9111 | 98.83 | 97.38 | 97.99 | 99.95 |
| Forest | 14,944 | 99.92 | 97.57 | 96.75 | 99.59 |
| Lucerne | 9477 | 94.68 | 97.79 | 96.81 | 94.68 |
| Wheat | 17,283 | 99.15 | 95.25 | 91.55 | 99.86 |
| Beet | 10,050 | 98.69 | 97.64 | 96.18 | 98.64 |
| Potatoes | 15,292 | 96.55 | 97.54 | 96.44 | 98.11 |
| Bare soil | 3078 | 100.00 | 99.74 | 100.00 | 100.00 |
| Grass | 6269 | 99.22 | 95.09 | 91.42 | 99.98 |
| Rapeseed | 12,690 | 98.16 | 95.29 | 91.65 | 99.76 |
| Barley | 7156 | 99.80 | 99.43 | 99.83 | 100.00 |
| Wheat 2 | 10,591 | 99.01 | 97.26 | 96.88 | 99.87 |
| Wheat 3 | 21,300 | 99.32 | 96.67 | 97.04 | 99.97 |
| Water | 13,476 | 99.98 | 99.80 | 99.93 | 99.90 |
| Buildings | 476 | 99.79 | 93.49 | 84.87 | 90.55 |
| AA | 98.84 | 97.27 | 95.77 | 98.68 | |
| OA | 98.72 | 97.26 | 96.15 | 99.26 | |
| K | 98.61 | 97.01 | 95.80 | 99.19 | |
and high-confidence decision fusion. RCNN-AA, which uses a convolutional-autoencoder-based attention to provide an appropriate input for a residual CNN, ranks third in terms of AA, OA and kappa coefficient. MMAL, which utilizes a cross-attention mechanism to compute the relationships among multi-level features, shows lower performance than the other methods. The corresponding classification maps are shown in Fig. 8. The cleanest classification map with the least noise is achieved by SGCN-LG; after that, DFC provides the most accurate classification map. There are more noisy pixels in the MMAL map than in those of the other methods.
Fig. 8.
Classification maps of some state-of-the-art methods.
In the following, some advantages of the proposed SGCN-LG model, which lead to its high performance, are summarized:
- The use of SPs has benefits such as noise reduction, provision of spatial information, and reduced complexity of the graph computations.
- With automatic determination of the number of SPs and of the number of spatially neighboring SPs used to construct the local graph, there is no free parameter in building the graph model.
- While the local information of each SP is obtained from its neighbors in adjacent areas, the global information, containing knowledge of the classes present in the labeled samples, is extracted from the whole image.
Conclusion
Superpixels (SPs) are used as the composing nodes of graphs for PolSAR image classification in this work. The number of SPs is automatically determined by analyzing a differential operator applied to the standard deviations of pixels in patches of different sizes. Two graphs, local and global, are constructed from the generated SPs: the local graph is made of adjacent SPs in local regions, and the global graph is made of the nearest labeled SPs from all classes. The local and global features are fused for classification; the relationships among spatial features are explored by the local graph, and the global structure of the labeled SPs is extracted by the global graph. The proposed model, benefiting from the advantages of SPs, graph networks, and local-global feature fusion, provides highly accurate classification results with the same or smaller training sets compared to several state-of-the-art methods. However, some challenges remain for future work. For example, accounting for within-superpixel variations, especially in heterogeneous regions, is important for pixel-based classification; assigning different weights to different pixels of a given SP could enhance the classification map. Moreover, the SLIC method is used here for PolSAR image segmentation outside the network; training an adaptive SP-generation block integrated with the classification network in a unified framework could lead to better alignment of class boundaries in the final classification result.
Author contributions
Maryam Imani has all roles of Conceptualization; Methodology; Software; Validation; Formal analysis; Investigation; Writing, review & editing.
Data availability
The datasets are available online at https://ietr-lab.univ-rennes1.fr/polsarpro-bio/san-francisco and https://github.com/fudanxu/CV-CNN?tab=readme-ov-file.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Duan, D. & Wang, Y. Reflection of and vision for the decomposition algorithm development and application in earth observation studies using PolSAR technique and data. Remote Sens. Environ. 261, 112498 (2021).
- 2. Imani, M. Two-step discriminant analysis based multi-view polarimetric SAR image classification with high confidence. Sci. Rep. 12, 5984 (2022).
- 3. Gomez, L., Alvarez, L., Mazorra, L. & Frery, A. C. Fully PolSAR image classification using machine learning techniques and reaction-diffusion systems. Neurocomputing 255, 52–60 (2017).
- 4. Li, H., Chen, J., Li, Q., Wu, G. & Chen, J. Mitigation of reflection symmetry assumption and negative power problems for the model-based decomposition. IEEE Trans. Geosci. Remote Sens. 54(12), 7261–7271 (2016).
- 5. Ghazvinizadeh, A. H., Imani, M. & Ghassemian, H. Residual network based on entropy–anisotropy–alpha target decomposition for polarimetric SAR image classification. Earth Sci. Inf. 16, 357–366 (2023).
- 6. Bilal, M., Israr, H., Shahid, M. & Khan, A. Sentiment classification of Roman-Urdu opinions using Naïve Bayesian, decision tree and KNN classification techniques. J. King Saud Univ. - Comput. Inform. Sci. 28(3), 330–344 (2016).
- 7. Parand, K., Aghaei, A. A., Jani, M. & Ghodsi, A. Parallel LS-SVM for the numerical simulation of fractional Volterra’s population model. Alexandria Eng. J. 60(6), 5637–5647 (2021).
- 8. Sánchez-Lladó, F. J., Pajares, G. & López-Martínez, C. Improving the Wishart synthetic aperture radar image classifications through deterministic simulated annealing. ISPRS J. Photogrammetry Remote Sens. 66(6), 845–857 (2011).
- 9. Shi, J., Wang, W., Jin, H. & He, T. Complex matrix and multi-feature collaborative learning for polarimetric SAR image classification. Appl. Soft Comput. 134, 109965 (2023).
- 10. Imani, M. Entropy/anisotropy/alpha based 3D Gabor filter bank for PolSAR image classification. Geocarto Int. 37(27), 18491–18519 (2022).
- 11. Imani, M. Classification using ridge regression-based polarimetric-spatial feature extraction in polarimetric SAR. In 2021 26th International Computer Conference, Computer Society of Iran (CSICC), Tehran, Iran, 1–5 (2021).
- 12. Chen, Y. et al. Nonlinear projective dictionary pair learning for PolSAR image classification. IEEE Access 9, 70650–70661 (2021).
- 13. Song, W., Wu, Y. & Guo, P. Composite kernel and hybrid discriminative random field model based on feature fusion for PolSAR image classification. IEEE Geosci. Remote Sens. Lett. 18(6), 1069–1073 (2021).
- 14. Latif, S. D. et al. Assessing rainfall prediction models: exploring the advantages of machine learning and remote sensing approaches. Alexandria Eng. J. 82, 16–25 (2023).
- 15. Imani, M. Integration of the k-nearest neighbours and patch-based features for PolSAR image classification by using a two-branch residual network. Remote Sens. Lett. 12(11), 1112–1122 (2021).
- 16. Wang, J. et al. Parameter selection of Touzi decomposition and a distribution improved autoencoder for PolSAR image classification. ISPRS J. Photogrammetry Remote Sens. 186, 246–266 (2022).
- 17. Imani, M. Low frequency and radar’s physical based features for improvement of convolutional neural networks for PolSAR image classification. Egypt. J. Remote Sens. Space Sci. 25, 55–62 (2022).
- 18. Shang, R., Wang, J., Jiao, L., Yang, X. & Li, Y. Spatial feature-based convolutional neural network for PolSAR image classification. Appl. Soft Comput. 123, 108922 (2022).
- 19. Zhang, P., Liu, C., Chang, X., Li, Y. & Li, M. Metric-based meta-learning model for few-shot PolSAR image terrain classification. In 2021 CIE International Conference on Radar (Radar), Haikou, Hainan, China, 2529–2533 (2021).
- 20. Imani, M. Residual convolutional neural network with autoencoder based attention for PolSAR image classification. In 2024 13th Iranian/3rd International Machine Vision and Image Processing Conference (MVIP), Tehran, Iran, 1–6 (2024).
- 21. Yang, Z., Wu, Y., Li, M., Hu, X. & Li, Z. Unsupervised change detection in PolSAR images using Siamese encoder–decoder framework based on graph-context attention network. Int. J. Appl. Earth Obs. Geoinf. 124, 103511 (2023).
- 22. Ling, J., Wei, S., Gamba, P., Liu, R. & Zhang, H. Advancing SAR monitoring of urban impervious surface with a new polarimetric scattering mixture analysis approach. Int. J. Appl. Earth Obs. Geoinf. 124, 103541 (2023).
- 23. Zhang, Z. C., Chen, Z. D., Wang, Y., Luo, X. & Xu, X. S. A vision transformer for fine grained classification by reducing noise and enhancing discriminative information. Pattern Recogn. 145, 109979 (2024).
- 24. Dong, H., Zhang, L. & Zou, B. Exploring vision transformers for polarimetric SAR image classification. IEEE Trans. Geosci. Remote Sens. 60, 1–15, Art. 5219715 (2022).
- 25. Imani, M. Attention based multi-level and multi-scale convolutional network for PolSAR image classification. Adv. Space Res. 75(11), 7971–7986 (2025).
- 26. Dong, H., Zhang, L., Lu, D. & Zou, B. Attention-based polarimetric feature selection convolutional network for PolSAR image classification. IEEE Geosci. Remote Sens. Lett. 19, 1–5, Art. 4001705 (2022).
- 27. Hua, W., Wang, X., Zhang, C. & Jin, X. Attention-based multiscale sequential network for PolSAR image classification. IEEE Geosci. Remote Sens. Lett. 19, 1–5, Art. 4506505 (2022).
- 28. Song, W., Wu, Y. & Xiao, X. Nonstationary PolSAR image classification by deep-features-based high-order triple discriminative random field. IEEE Geosci. Remote Sens. Lett. 18(8), 1406–1410 (2021).
- 29. Choi, K. S. & Oh, K. W. Subsampling-based acceleration of simple linear iterative clustering for superpixel segmentation. Comput. Vis. Image Underst. 146, 1–8 (2016).
- 30. Yin, J. et al. SLIC superpixel segmentation for polarimetric SAR images. IEEE Trans. Geosci. Remote Sens. 60, 1–17, Art. 5201317 (2022).
- 31. Li, M. et al. Efficient superpixel generation for polarimetric SAR images with cross-iteration and hexagonal initialization. Remote Sens. 14, 2914 (2022).
- 32. Guo, Y. et al. Adaptive fuzzy learning superpixel representation for PolSAR image classification. IEEE Trans. Geosci. Remote Sens. 60, 1–18, Art. 5217818 (2022).
- 33. Yang, S., Yuan, X., Liu, X. & Chen, Q. Superpixel generation for polarimetric SAR using hierarchical energy maximization. Comput. Geosci. 135, 104395 (2020).
- 34. Li, M. et al. Superpixel generation for polarimetric SAR images with adaptive size estimation and determinant ratio test distance. Remote Sens. 15, 1123 (2023).
- 35. Cao, Y., Wu, Y., Li, M., Liang, W. & Zhang, P. PolSAR image classification using a superpixel-based composite kernel and elastic net. Remote Sens. 13(3), 380 (2021).
- 36. Ma, F., Zhang, F., Xiang, D., Yin, Q. & Zhou, Y. Fast task-specific region merging for SAR image segmentation. IEEE Trans. Geosci. Remote Sens. 60, 1–16, Art. 5222316 (2022).
- 37. Zhang, F., Sun, X., Ma, F. & Yin, Q. Superpixelwise likelihood ratio test statistic for PolSAR data and its application to built-up area extraction. ISPRS J. Photogrammetry Remote Sens. 209, 233–248 (2024).
- 38. Hua, W., Zhang, C., Xie, W. & Jin, X. Polarimetric SAR image classification based on ensemble dual-branch CNN and superpixel algorithm. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 15, 2759–2772 (2022).
- 39. Shi, J. et al. Polarimetric synthetic aperture radar image classification based on double-channel convolution network and edge-preserving Markov random field. Remote Sens. 15, 5458 (2023).
- 40. Ren, J., Zhu, K., Hu, M., Shang, R. & Zhang, M. Polarimetric SAR image classification based on superpixel content-aware and semi-supervised ViT network. Appl. Soft Comput. 186(Part A), 114040 (2026).
- 41. Liu, H. et al. Graph convolutional networks by architecture search for PolSAR image classification. Remote Sens. 13, 1404 (2021).
- 42. Bi, H., Sun, J. & Xu, Z. A graph-based semisupervised deep learning model for PolSAR image classification. IEEE Trans. Geosci. Remote Sens. 57(4), 2116–2132 (2019).
- 43. Yu, B., Xie, H., Fu, Y. & Xu, Z. Three-way graph convolutional network for multi-label classification in multi-label information system. Appl. Soft Comput. 161, 111767 (2024).
- 44. Xu, D. et al. Difference-guided multiscale graph convolution network for unsupervised change detection in PolSAR images. Neurocomputing 555, 126611 (2023).
- 45. Ma, F., Zhang, F., Yin, Q., Xiang, D. & Zhou, Y. Fast SAR image segmentation with deep task-specific superpixel sampling and soft graph convolution. IEEE Trans. Geosci. Remote Sens. 60, 1–16, Art. 5214116 (2022).
- 46. Geng, J., Wang, R. & Jiang, W. Polarimetric SAR image classification based on feature enhanced superpixel hypergraph neural network. IEEE Trans. Geosci. Remote Sens. 60, 1–12, Art. 5237812 (2022).
- 47. Shi, J., He, T., Ji, S., Nie, M. & Jin, H. CNN-improved superpixel-to-pixel fuzzy graph convolution network for PolSAR image classification. IEEE Trans. Geosci. Remote Sens. 61, 1–18, Art. 4410118 (2023).
- 48. Wang, R., Nie, Y. & Geng, J. Multiscale superpixel-guided weighted graph convolutional network for polarimetric SAR image classification. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. 17, 3727–3741 (2024).
- 49. Zhu, W., Zhao, C., Feng, S. & Qin, B. Multiscale short and long range graph convolutional network for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 60, 1–15, Art. 5535815 (2022).
- 50. Zhou, Q., Gao, Q., Wang, Q., Yang, M. & Gao, X. Sparse discriminant PCA based on contrastive learning and class-specificity distribution. Neural Netw. 167, 775–786 (2023).
- 51. Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (ICLR 2017), Toulon, France (2017).
- 52. Mou, L., Lu, X., Li, X. & Zhu, X. X. Nonlocal graph convolutional networks for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 58(12), 8246–8257 (2020).
- 53. Imani, M. A random patches based edge preserving network for land cover classification using polarimetric synthetic aperture radar images. Int. J. Remote Sens. 42(13), 4946–4964 (2021).
- 54. Roggo, Y., Duponchel, L. & Huvenne, J. P. Comparison of supervised pattern recognition methods with McNemar’s statistical test: application to qualitative analysis of sugar beet by near-infrared spectroscopy. Anal. Chim. Acta 477(2), 187–200 (2003).
Data Availability Statement
The datasets are available online at https://ietr-lab.univ-rennes1.fr/polsarpro-bio/san-francisco and https://github.com/fudanxu/CV-CNN?tab=readme-ov-file.