Abstract
Remote sensing has a large amount of content information and high complexity, making it difficult to annotate image, high demand for professional knowledge, and high economic cost. There are significant differences in the quality of data labels obtained, which poses various information limited challenges in object detection. Data labels with significant differences in quality pose various challenges with limited information for object detection. Therefore, a new oriented object detection technique based on angle quality estimation (AQE-detector) was designed to overcome the challenges of missing confidence information on oriented angle and insufficient object localization accuracy. Firstly, a periodic Gaussian distribution was used to model the oriented angle variable, transforming the angle regression into the distribution estimation, greatly improving the accuracy of the angle and implicitly estimating the potential confidence. Based on this, the non-maximum suppression based on angle quality (NMS-AQ) was proposed to alleviate the confirmation bias caused by existing methods that only use classification confidence to evaluate detection results. An angle loss function based on aspect-ratio perception (Double-A Loss) was designed to effectively improving the overall detection performance. The intelligent object detection technology under information limited conditions has been studied, facilitating the efficient application of object detection in major demand fields, such as remote sensing and non-destructive testing.
Keywords: Object detection, Angle quality estimation, Remote sensing, Oriented angle, Aspect-ratio perception
Subject terms: Engineering, Mathematics and computing
Introduction
At present, the emergence of satellite image sharing platforms such as Google Earth and Amap has led to an increasing scale of high-resolution images. Intelligent interpretation of remote sensing has received more attention in various fields. Object detection is widely used in locating targets of interest, such as airplanes and ships, and in many extensions such as military reconnaissance, road rescue, and environmental monitoring1–3. The targets in remote sensing through on-board or satellite imaging sensors often exhibit arbitrary directional distributions. It is difficult to accurately locate remote sensing objects with oriented angles using general detection algorithms4–6, which will cause problems such as overlapping between object bounding boxes and background within object boxes, greatly affecting the detection performance7.
Many detection methods based on oriented bounding box8–12 have been proposed and have made encouraging progress. Wang et al.13 introduced an angle prediction branch based on the general object detector - Faster RCNN, accurately predicting the orientation of remote sensing objects. Song et al.14 introduced the angle prediction branch into the object candidate region generation network to generate a series of object candidate regions with oriented angles, enhancing the feature extraction ability for oriented objects. On this basis, Liu et al.15 fully explored the multi-scale features of remote sensing through a densely connected feature pyramid mechanism, overcoming scale fluctuations. Jin et al.16 used a model to automatically learn the parameter transformation relationship between candidate region boxes and the oriented objects, thereby accurately extracting oriented object features. Ravi et al.17 proposed an L1 loss function based on IoU to eliminate the inconsistency between angle loss function and localization performance. Mandi et al.18 proposed a modulation loss function for direct regression of target vertices, which avoids the inconsistency between the loss function of the oriented object and its localization performance by reordering each vertex. Zheng et al.19 designed an oriented feature refinement module to improve the detection performance of target regions. Furthermore, Chen et al.20 proposed a sample allocation method to mitigate the difference between classification and the localization accuracy of detection box.
Although the above-mentioned algorithm based on oriented bounding box has made some progress, there are still three urgent challenges that need to be addressed: (1) Existing methods usually treat angles as ordinary variables for regression, without considering their periodic characteristics, resulting in inconsistency between the monotonicity of the loss function and the periodicity of angles. Although recent studies such as Gaussian Wasserstein Distance (GWD)21 and Circular Smooth Label (CSL)22 have attempted to address this issue from a distribution modeling perspective, they still have limitations. GWD converts the oriented bounding box into a 2D Gaussian distribution and calculates the distribution distance, which is a complex process and does not explicitly model the boundary problem of the angular period. CSL transforms angle regression into a classification problem. Although it avoids cycle jumps, its predictions are deterministic and cannot provide confidence (or quality) information for angle predictions. (2) Most existing algorithms only use classification confidence to evaluate the performance, while ignoring the quality of predicted angles that are significantly related to localization performance. Due to the inconsistent difference between classification confidence and localization accuracy, detection results with lower classification confidence but more accurate localization are suppressed. Most existing algorithms only use classification confidence to evaluate the quality of detection boxes, while ignoring the significant correlation between angle quality and localization accuracy. Although previous studies such as IoU-aware NMS or Quality-aware NMS have proposed using IoU or other quality indicators to assist NMS, these indicators are usually focused on horizontal boxes, and have not specifically evaluated the quality of the “angle” dimension in oriented object detection. (3) Most existing studies have overlooked the correlation between angle prediction and object aspect-ratio. Applying the same level of supervision to all objects can make it difficult to fully learn accurate angles for angle sensitive objects, and vice versa, it can generate redundant supervision information.
Therefore, a new oriented object detection technique based on angle quality estimation (AQE-detector) was proposed. This method overcomes the phenomenon of inconsistent loss function values and target localization results caused by periodic angles, thereby significantly improving the accuracy of angle prediction; Next, the process proposes the non-maximum suppression based on angle quality (NMS-AQ). By comprehensively considering the angle quality and classification confidence, the detection is comprehensively evaluated to alleviate the inconsistency in confidence and positioning between the two; In addition, the relationship between angle sensitivity and object aspect-ratio was quantitatively analyzed, and an angle loss function based on aspect-ratio perception (Double-A Loss) was designed.
A new oriented object detection technique based on angle quality estimation (AQE-detector) has been proposed. The problem of inconsistent optimization function values and object detection results caused by angle periodicity has been overcome, significantly improving the accuracy of angle prediction and implicitly estimating the quality information.
Non-maximum suppression based on angle quality (NMS-AQ) was designed. By comprehensively considering the angle quality and classification confidence, the detection is comprehensively evaluated to alleviate the inconsistency in confidence and positioning between the two.
An angle loss function based on aspect-ratio perception (Double-A Loss) was proposed. This function adaptively adjusts the loss weight of the angle prediction branch based on the aspect ratio, reducing redundant supervision information.
Methods
A new oriented object detection technique is studied for coping the inconsistency between angle periodicity and loss function monotonicity, inconsistency between classification confidence and localization accuracy, inconsistency between objects with different geometric characteristics and the same angle supervision intensity. Figure 1 shows the model structure. Figure 1 illustrates the overall architecture of the proposed AQE-Detector, which features three specialized branches for angle prediction: Angle Classification, Angle Regression, and Angle Quality Estimation. The Angle Classification branch follows the conventional approach similar to CSL22, predicting discrete angle categories through a 180-dimensional output representing probability distribution over angle classes. This branch provides a deterministic angle prediction based on the highest probability category. The Angle Regression branch employs direct regression to predict the continuous angle value, treating angle as an ordinary regression variable without considering its periodic nature. The Angle Quality Estimation (AQE) branch, which is the core contribution of this work, models the angle as a periodic Gaussian distribution. It predicts both the mean and variance parameters, with the 180-dimensional output representing the discretized probability density function of the continuous angle distribution. This probabilistic formulation naturally generates angle quality scores that reflect prediction uncertainty. During inference, the AQE branch’s mean prediction is used as the final angle estimate, while the variance-derived quality score is integrated with classification confidence in the proposed NMS-AQ to comprehensively evaluate detection quality.
Fig. 1.
Overall structure of oriented object detection.
This method overcomes the phenomenon of inconsistent loss function values and target localization results caused by periodic angles, thereby significantly improving the accuracy of angle prediction. Afterwards, the non-maximum suppression based on angle quality (NMS-AQ) was proposed. By comprehensively considering the angle quality and classification confidence, the detection is comprehensively evaluated to alleviate the inconsistency in confidence and positioning between the two. Finally, the relationship between angle sensitivity and object aspect-ratio was quantitatively analyzed, and an angle loss function based on aspect-ratio perception was designed. In the Double-A Loss illustration, the red oriented bounding boxes represent angle-sensitive objects with high aspect ratios (e.g., ships, large vehicles), which require stronger angle supervision; the blue dashed boxes represent angle-insensitive objects with low aspect ratios (e.g., storage tanks, roundabouts), for which the angle loss is adaptively reduced. The curve depicts how the loss weight adapts based on the object’s aspect ratio.
Angle quality estimation
Angle variable modeling: Gaussian distribution functions with periodicity are used to model angle variables. Specifically, an angle quality estimation branch was constructed to predict mean and variance:
![]() |
FC AQE represents the angle quality estimation branch, while Fobj is a feature map of the target area extracted by pooling operations on the region of interest using a backbone network23. The variables µ and σ predicted by the fully connected layer are unbounded and cannot be directly used to characterize the variables under Gaussian distribution. θµ and θσ are the mean and standard deviation of the angle, respectively. They need to be limited to the range of angle definitions. To this end, the range of the mean value of the angle variable is limited to 0 to π, and its standard deviation is limited to 0 to 1:
![]() |
The angle variable is modeled as:
![]() |
Among them, αθ is the periodic factor. After modeling the angle variable, θσ can implicitly reveal the quality information24. Next, the true value label of the angle is modeled as a Dirac distribution:
![]() |
At this point, the transformation from angle prediction task to distribution estimation task has been completed. N training samples are used to optimize the weight, so that the angle distribution P(θ) approaches the label distribution δ(θ):
![]() |
Distance (|) is a metric function used to estimate the difference between two distributions. By representing angle variables in the form of periodic Gaussian distributions, the phenomenon of sudden changes in angle prediction loss values at the boundary of the defined range due to the inconsistency between angle periodicity and loss function monotonicity is avoided, thereby improving the accuracy of angle prediction.
Optimization of angle quality estimation branch: Converting the angle distribution into discrete variables reduces the difficulty of model optimization. Specifically, the angle continuous distribution established above is sampled at equal intervals to obtain an angle Gaussian vector of length M, which is called LG(x):
![]() |
Among them,
, x represents the discrete value of angle, which is the sampling point used to calculate the probability density function of Gaussian distribution. Gaussian random vector with angular direction is calculated:
![]() |
Among them, the resolution Rθ of the angle variable was set to 1 through parameter experimental analysis. Similarly, the label distribution δ(θ) is also encoded:
![]() |
Through this approach, the distribution estimation task has been successfully simplified into a classification problem, making it easier to train and optimize the model25.
Angle quality estimation branch inference: In the inference stage, the model’s predicted θµ from the AQE branch is directly and solely regarded as the object’s oriented angle to generate the final detection results. The predictions from the Angle Classification and Angle Regression branches are only utilized during the training phase to enrich feature learning and provide auxiliary supervision. Based on the predicted θσ, the angle quality information is calculated using the following formula:
![]() |
Based on angle quality, AQ-NMS is constructed in the subsequent stage. The angle quality and classification confidence are comprehensively considered to re evaluate the detection results.
Theoretical analysis: A theoretical analysis was conducted on whether the variable θσ can represent the quality information. The objective function based on cross entropy loss in Θ is transformed into:
![]() |
Among them,
can be simplified and decomposed into21:
According to αθ, it can be deduced that
. Therefore, the above equation can be rewritten:
![]() |
Among them,
needs to be minimized, so θσ will converge to 0. On the other hand,
needs to be maximized. According to αθ, when θσ approaches θgt, αθ will tend towards 0. Therefore,
can be rewritten as
, resulting in θµ and θσ tending towards θgt and 1, respectively. Therefore, as θµ approaches θgt, θσ decreases25,26.
Angle Quality Estimation (AQE) is fundamentally different from works such as Gaussian Wasserstein Distance (GWD)21 and Circular Smooth Label (CSL)22. CSL is a deterministic classification method based on label smoothing, whose output is the probability of each angle category, but it cannot provide continuous estimation of angle uncertainty. The core advantage of the AQE lies in: (1) Directness: Directly modeling the one-dimensional probability distribution of the angle as a periodic variable, the principle is more intuitive; (2) Information richness: The mean (θµ) and uncertainty (θσ) of angle prediction are naturally generated, namely angle quality; (3) Integration: The quality score of the output can be directly and seamlessly used to improve post-processing processes (such as non-maximum suppression based on angle quality). This “modeling-estimation-application” closed loop is the unique contribution of this study.
Differentiation from Multi-branch Angle Prediction: The proposed AQE-detector incorporates three parallel branches for angle prediction, each with distinct characteristics and purposes. The Angle Classification branch, similar to CSL22, provides discrete angle predictions but lacks uncertainty quantification. The Angle Regression branch offers continuous angle estimates but suffers from periodicity boundary issues. In contrast, the AQE branch uniquely combines the advantages of both: it provides continuous angle estimation through probabilistic modeling while naturally generating quality scores that reflect prediction confidence. This multi-branch design allows for comprehensive angle representation: the classification branch captures discrete angular patterns, the regression branch provides direct continuous estimates, and the AQE branch offers probabilistic uncertainty awareness. During training, all three branches contribute to the learning process, but during inference, only the AQE branch’s predictions are utilized, as they encapsulate both angle estimation and quality assessment in a unified framework. This represents a significant advancement over single-branch approaches like CSL22 or pure regression methods, enabling more robust and reliable oriented object detection.
Non-maximum suppression based on angle quality
Non-maximum Suppression (NMS) is usually applied to a series of candidate detection boxes. The conventional non-maximum suppression operation only uses the detection box as the evaluation criterion, suppressing detection boxes with lower classification confidence27. However, some oriented detection boxes with lower classification confidence actually have higher localization accuracy, making it difficult to accurately preserve high-precision detection boxes. Considering the impact of angle prediction on the localization, the non-maximum suppression based on angle quality (NMS-AQ) is proposed.
Specifically, any detection box is represented as
. AQ-NMS combines the classification confidence with the angle quality to generate a comprehensive confidence S:
![]() |
Among them, scorecls are classification scores, and ξ is a weight factor used to control the fusion ratio of classification confidence and angle quality28. AQ-NMS can effectively filter out inaccurate detection boxes.
Non-maximum Suppression (NMS) is a key step in post-processing of object detection. Traditional NMS relies solely on classification confidence ranking, which has many limitations. Therefore, a series of improvement methods have been proposed. Soft-NMS29 alleviates dense target problems through attenuation rather than inhibition; Methods such as IoU-aware NMS30 and Fitness-NMS31 introduce IoU or fit between predicted and real boxes as new scoring criteria to better reflect localization accuracy. These methods can be collectively referred to as Quality-aware NMS.
Comparison with Existing NMS Methods. The proposed NMS-AQ is fundamentally different from and addresses specific limitations of existing NMS methods in the context of oriented object detection. Traditional & Soft-NMS29: These methods rely solely on classification confidence (scorecls) for ranking and suppression. This often leads to the suppression of detection boxes with high localization accuracy but slightly lower classification scores, due to the common mismatch between classification and localization confidence. Quality-aware NMS (e.g., IoU-aware30, Fitness-NMS31: These methods introduce a general localization quality metric (e.g., IoU) to form a combined score (e.g., S = scorecls × IoU). While this is an improvement, the IoU is a holistic measure resulting from all bounding box parameters (center point, width, height, and angle). It cannot specifically and explicitly quantify the uncertainty or quality of the angle prediction alone, which is crucial for oriented objects. Proposed NMS-AQ: Our method introduces a dedicated angle quality score (qualityθ) derived directly from the probabilistic output of the AQE branch. This score explicitly measures the reliability of the angle prediction. By fusing qualityθ with score ~ cls~ (i.e., S = scorecls + ξ · qualityθ), NMS-AQ can prioritize boxes with highly reliable orientations. This is particularly advantageous for elongated objects (e.g., ships, vehicles) where a small angle error can cause a significant IoU drop, ensuring that detections with precise angles are preserved even if their classification scores are not the highest.
Angle loss function based on aspect-ratio perception
Objects are highly sensitive to angle changes, such as large vehicles and ships in ports, so slight deviations in oriented angles can lead to significant differences in their localization accuracy. In existing detection methods, the same level of angle supervision is usually applied to various types of objects, which makes it difficult for angle sensitive objects to receive sufficient training, and generates a lot of redundant supervision information for angle insensitive objects. Specifically, for angle insensitive objects, the correlation between their localization accuracy and oriented angle is weak32,33. Therefore, applying too much supervision information to these objects is redundant, resulting in excessive redundant supervision information. For angle sensitive objects, angle prediction requires the application of more supervisory information. An angle loss function based on aspect-ratio perception (Double-A Loss) is proposed to address the above issues, dynamically adjusting the weight in combination with the geometric structure of different objects. The name ‘Double-A’ signifies that the loss incorporates dual considerations of Aspect-ratio and Angle, which are the core factors it adapts to.
The coefficients λL, λQ, λG, ζL, ζQ, and ζG for the linear, quadratic, and Gaussian transformation functions were determined through a curve fitting process. A theoretical relationship between aspect ratio and desired angle sensitivity was defined: sensitivity should peak at very high and very low aspect ratios and be lowest for near-square objects. The idealized trend curve was plotted. Subsequently, the least squares method was employed to optimize the parameters of each candidate function (linear ASL, quadratic ASQ, and Gaussian ASG32 to best fit this idealized curve. Using a series of transformation functions (ASL, ASQ, and ASG) to characterize the trend of change, and approximating the curve using the least squares method:
![]() |
Among them, the values λL = -1.376, λQ = -2.013, λG = -3.386, ζL = 1.078, ζQ = 0.916, and ζG = -0.044, respectively, are the optimal parameters resulting from this fitting process, which minimizes the total squared error between the function output and our predefined trend. By using this conversion function, it is possible to adaptively adjust the angle supervision strength of objects with different aspect ratios, thereby reducing redundant angle losses, as shown below:
![]() |
In addition, although AS* effectively adjusts its angle loss intensity for different objects, it neglects the adjustment of the loss proportion between simple and difficult samples33. A well-known solution to the easy/hard sample imbalance problem is the Focal Loss34. While Double-A Loss adopts a similar modulating factor (γ) to dynamically scale the loss, its core purpose and application are fundamentally different. Focal Loss addresses the foreground-background class imbalance by down-weighting the loss for well-classified background examples (easy negatives). In contrast, Double-A Loss addresses the disparity in angle sensitivity among different objects. The modulating factor here is applied to down-weight the angle loss for objects that are inherently less sensitive to angle errors (typically those with low aspect ratios), regardless of their class. Therefore, the innovation of Double-A Loss lies not in the modulating mechanism itself, but in its novel integration with the aspect-ratio-sensitive weighting (AS*). It creates a unified loss function that simultaneously addresses sample-level difficulty (easy/hard) and object-level geometric property (angle sensitivity), which is a new challenge specific to oriented object detection. The loss weights of each objective are adaptively adjusted, and the final angle loss function is:
![]() |
Using angle sensitivity control to penalize difficult samples can effectively avoid redundant angle supervision on objects with lower aspect ratios and improve overall detection accuracy35.
Multi-task joint optimization
A plug and play angle prediction method has been studied, which can be combined with any object detector. The construction of the multi-task loss function refers to the classical object detector Faster-RCNN:
![]() |
Among them, Losscls, Lossbbox, and LossDouble−A represent the classification loss, regression loss, and prediction loss, respectively36. For angle prediction loss, the proposed Double-A Loss function is used. The hyperparameters λc, λb, and λθ, which balance the multi-task loss function, were determined through a grid search on the validation set (DOTA validation split). We aimed to find a combination that allows all tasks (classification, regression, and angle prediction) to converge harmoniously. The search range for parameter λc is [0.1, 0.5, 1.0, 2.0, 3.0]; the search range for parameter λb is [0.1, 0.5, 1]; the search range for parameter λθ is [0.1, 0.2, 0.3, 0.4, 0.5]. The final combination (λc = 2.0, λb = 1.0, λθ = 0.2) was selected as it yielded the highest overall mAP, indicating a good balance between the different learning objectives.
It is important to note that during inference, the model operates in a streamlined manner. Only the predictions from the AQE branch (θµ for the angle and scorecls for classification) are used to form the final oriented bounding boxes. The Angle Classification and Regression branches are disabled at this stage. This approach allows us to leverage the benefits of multi-task learning during training—where each branch contributes to learning a more generalized feature representation—while maintaining a simple and efficient inference pipeline that capitalizes on the superior accuracy and inherent quality estimation of the AQE method.
Experimental setup
Data and setup
Different algorithms were evaluated on three datasets: DOTA37, HRSC201638, and ICDAR201539.
HRSC2016 is an oriented objects dataset. This dataset mainly includes two scenarios: offshore ships and nearshore ships. The images are divided into a training set (617 images) and a testing set (444 images), with a resolution of 300 × 300 pixels ~ 1500 × 900 pixels.
DOTA is an oriented objects dataset. The dataset is divided into training dataset (10276 images), validation dataset (2957 images), and testing dataset (10833 images), with a resolution of 1024 × 1024 pixels.
To further validate the effectiveness in other detection fields, it was validated on the ICDAR2015 dataset. ICDAR2015 is divided into a training set (1000 images) and a testing set (500 images), with a resolution of 720 × 1280 pixels, used for detecting and recognizing oriented text in the scene.
The operating system used in the experiment is Ubuntu 16.04LTS, the processor model is Intel Xeon(R) CPU E5-2680v4@2.40 GHz×51, the memory size is 128GB, and the graphics card is NVIDIA Ge Force GTX2080Ti. The programming language used in the experiment is Python 3.7, and the deep learning framework is PyTorch. The software tools used include Python 3.7 (https://www.python.org), PyTorch (deep learning framework, https://pytorch.org), OpenCV 4.5.4 (image processing, https://opencv.org), and Matplotlib 3.5.1 (visualization, https://matplotlib.org). The pre trained weights on ImageNet are used to initialize the backbone network (i.e. ResNet), and the weight parameters of other network layers are randomly initialized using a zero mean normal distribution with a standard deviation of 0.01. Stochastic Gradient De-scent (SGD) was selected to optimize the model and set the initial learning rate to 0.01. The learning rate decreased by 10 times in the 8th and 11th training batches, with a total of 24 training batches trained and the batch size set to 4. Data augmentation techniques including random horizontal flipping, random rotation, and random scaling are adopted to enhance the robustness of the detector, and the target detection threshold is set to 0.001.
Evaluation metrics
The average accuracy indicators of categories include the conventional accuracy indicator mean average precision (mAP) when the IoU is 0.5, and the high-precision indicator mAP75 when the IoU is 0.75.
![]() |
Among them, NR is set to 11, and R and P respectively represent recall (R) and precision (P):
![]() |
Among them, TP, FP, and FN respectively represent the number of true positives, false positives, and false negatives. If the IoU between the detection result and the target box exceeds the threshold, the detection result is considered a true positive, otherwise it is considered a false positive.
Results analysis
Ablation experiment
Effectiveness of the angle quality estimation module (AQE).
The experimental results in Table 1 are all reproduced under the same conditions, with each row showing the best performance highlighted in bold, and the underlined numbers representing the second best performance. Table 1 presents the experimental results of models with different modules added on different types of targets. The different modules of our method have been proven to significantly improve the comprehensive detection efficiency of various targets. As this method overcomes the inconsistency between angle periodicity and loss function monotonicity, it makes the model localization performance more accurate. Meanwhile, it can be observed that the introduction of the angle quality estimation module demonstrates greater improvement in the high-precision metric mAP75. This method has significant advantages for angle sensitive objects, such as large vehicles, ships, and harbor, achieving detection accuracy improvements of 6.22%, 7.91%, and 3.01%.
Table 1.
Results of AQE-Detector on DOTA validation set (mAP and mAP75).
| mAP | mAP75 | |||||||
|---|---|---|---|---|---|---|---|---|
| AQE | × | √ | √ | √ | × | √ | √ | √ |
| Double-A Loss | × | × | √ | √ | × | × | √ | √ |
| NMS-AQ | × | × | × | √ | × | × | × | √ |
| Plane | 90.01 | 90.00 | 90.00 | 89.55 | 66.23 | 69.61 | 67.09 | 69.87 |
| BD | 76.97 | 77.53 | 78.85 | 79.26 | 27.72 | 31.16 | 28.19 | 33.21 |
| Bridge | 45.61 | 47.54 | 47.18 | 47.30 | 6.29 | 7.94 | 11.77 | 10.70 |
| GTF | 69.52 | 69.10 | 72.31 | 72.51 | 39.76 | 41.48 | 41.87 | 42.03 |
| SV | 65.81 | 66.06 | 66.10 | 66.30 | 26.15 | 26.80 | 27.87 | 28.14 |
| LV | 78.05 | 79.05 | 79.45 | 80.22 | 38.52 | 44.74 | 46.21 | 46.26 |
| Ship | 87.17 | 88.11 | 88.22 | 88.25 | 34.72 | 42.63 | 42.41 | 42.76 |
| TC | 90.84 | 90.86 | 90.87 | 90.85 | 79.92 | 81.02 | 79.82 | 83.98 |
| BC | 64.28 | 69.56 | 69.00 | 68.88 | 49.33 | 52.90 | 57.69 | 57.76 |
| ST | 87.95 | 87.63 | 87.68 | 87.68 | 55.30 | 54.75 | 56.66 | 56.73 |
| SBF | 72.51 | 72.64 | 77.84 | 78.27 | 46.32 | 48.66 | 43.75 | 44.58 |
| RA | 65.91 | 69.14 | 70.01 | 70.19 | 35.36 | 34.37 | 35.01 | 34.66 |
| Harbor | 64.81 | 65.46 | 65.23 | 65.83 | 12.39 | 15.40 | 20.30 | 20.96 |
| SP | 64.36 | 64.37 | 64.81 | 65.41 | 5.30 | 11.60 | 11.53 | 11.92 |
| HC | 48.33 | 50.23 | 56.19 | 55.45 | 5.88 | 10.47 | 13.94 | 15.93 |
*Plane, plane; Ship, ship; ST, storage tank; BD, baseball diamond; TC, tennis court; BC, basketball court; GTF, ground track field; Harbor, harbor; Bridge, bridge; LV, large vehicle; SV, small vehicle; HC, helicopter; RA, roundabout; SF, soccer field; SP, swimming pool.
The best performancehighlighted in bold, and the italic numbers representing the second best performance.
To quantitatively dissect the contribution of each proposed module, we conduct a comprehensive ablation study following a progressive integration path, with results summarized in Table 2. Case 1: Baseline - no any modules; Case 2: adding AQE; Case 3: adding AQE and Double-A Loss; Case 4: adding AQE, Double-A Loss and NMS-AQ (Ours).
Table 2.
Ablation results of different modules on DOTA validation set (mAP and mAP75).
| Module | mAP (%) | mAP75 (%) | ΔmAP75 (vs. Baseline) |
|---|---|---|---|
| Case 1: Baseline | 71.47 | 35.28 | |
| Case 2: Baseline + AQE | 72.49 | 38.24 | + 2.96 |
| Case 3: Baseline + AQE + Double-A Loss | 73.58 | 38.94 | + 3.66 |
| Case 4: Baseline + AQE + Double-A Loss + NMS-AQ (Ours) | 73.73 | 39.97 | + 4.69 |
Contribution of Angle Quality Estimation (AQE): Comparing Case 2 (AQE only) with the Baseline (Case 1), we observe a significant gain of 2.96% in mAP75. This substantial improvement under a high IoU threshold underscores that modeling angle as a periodic distribution fundamentally enhances angle prediction accuracy, which is the cornerstone for high-precision localization.
Synergy between AQE and Double-A Loss: When both AQE and Double-A Loss are employed (Case 3), the performance (38.94% mAP75) surpasses the sum of their individual gains, achieving a synergistic improvement of + 3.66% over the baseline. This indicates that high-quality angle prediction (AQE) and aspect-ratio-aware supervision (Double-A Loss) are complementary and mutually reinforcing.
Contribution of NMS-AQ: Finally, integrating the NMS-AQ module (Case 4, Full Model) pushes the mAP75 to 39.97%, an overall gain of 4.69%. This final step demonstrates that leveraging the angle quality score to refine the post-processing stage effectively selects detection boxes with more reliable orientations, thereby translating accurate predictions into superior final results.
Figure 2 presents the ablation experiments on the HRSC2016 dataset. According to the table, compared to the benchmark model Faster-RCNN, the insertion angle quality estimation module significantly improves the detection performance of ship objects, achieving an 8.31% improvement in the mAP. Ship objects usually have higher aspect ratios, so precise detection of these objects requires higher accuracy in angle prediction. Data augmentation alone, while beneficial, does not specifically address the core challenges of angle prediction periodicity or the mismatch between classification and localization confidence. Thus, the improvement in mAP75 (from 38.92% to 50.04%) is more modest compared to the full model with AQE and other modules, which explicitly enhance angle estimation and localization accuracy. This highlights the necessity of our proposed modules for high-precision oriented object detection. Due to the improved optimization process of the angle prediction branch in this method, its detection performance has been significantly improved. At the same time, the improvement it brings increases with the improvement of detection standards. Under the more accurate detection standard mAP75, a performance improvement of 11.20% can be achieved.
Fig. 2.

Trend of models with different modules on the HRSC2016 test set.
To analyze the cost introduced by the angle quality estimation module, experimental comparisons were conducted in Table 3 between the angle regression based method (RetinaNet)34, the angle classification based method (CSL)40, our own method (AQE-Detector), and other methods. Among them, RetinaNet34 was selected as the baseline model, with an input image size of 800 × 800. As illustrated in Table 3, the proposed AQE-Detector achieves an excellent balance between accuracy and efficiency. Parameter Efficiency: AQE-Detector introduces a very lightweight Angle Quality Estimation branch. Consequently, AQE-Detector has a highly competitive number of parameters (36.15 M), which is comparable to the lightweight RetinaNet40 (36.13 M) and significantly lower than other complex detectors like Faster R-CNN (41.55 M), RoI-Transformer (43.82 M), and ReDet (45.66 M). This demonstrates the parameter efficiency of our design. Computational Cost: GFLOPs of AQE-Detector (128.37G) are comparable to Faster R-CNN and significantly lower than more complex detectors like RoI-Transformer and ReDet. This indicates that our added modules incur negligible additional computational burden during the forward pass.
Table 3.
Comparison of model parameters and training costs on the HRSC2016 test set.
| Methods | Model parameter (MB) | GFLOPs | Training time/iteration | mAP | mAP75 |
|---|---|---|---|---|---|
| Faster R-CNN18 | 41.55 | 137.42 | 0.3592 | 72.47 | 56.81 |
| RoI-Transformer41 | 43.82 | 225.67 | 0.3700 | 81.54 | 62.44 |
| ReDet42 | 45.66 | 239.23 | 0.4833 | 76.52 | 70.28 |
| RetinaNet34 | 36.13 | 128.09 | 0.2495 | 83.21 | 50.72 |
| CSL40 | 40.15 | 181.79 | 0.3870 | 85.40 | 60.20 |
| AQE-Detector | 36.15 | 128.37 | 0.3468 | 86.28 | 72.32 |
In summary, the AQE-Detector provides a significant accuracy improvement over the baseline and other methods while maintaining a model complexity (in terms of parameters and GFLOPs) on par with the most efficient detectors like RetinaNet. This makes it a highly efficient and practical solution for oriented object detection tasks.
-
2.
Effectiveness of angle loss function based on aspect-ratio perception (Double-A Loss).
As shown in Table 1, introducing an angle loss function based on aspect-ratio perception (Double-A Loss) can effectively improve detection performance. The detector improved by 1.09% and 0.70% on the mAP and mAP75 of the DOTA dataset, respectively. This study improved the optimization process of the model and enhanced the recognition efficiency of rare low-frequency objects. Figure 2 shows the detection results on the HRSC2016 dataset. Introducing the new loss function based on aspect-ratio perception on the basis of the angle quality estimation module can bring additional improvements of 0.77% and 4.34% on the mAP and mAP75 metrics, demonstrating its effectiveness.
-
3.
Effectiveness of Non-Maximum Suppression based on Angle Quality (NMS-AQ).
The estimated angle quality is used to perform post-processing operations on the model detection results, comprehensively evaluating the quality of each detection bounding box. Table 1 indicate that the non-maximum suppression based on angle quality (NMS-AQ) can effectively improve the detection performance, especially for the high-precision mAP75, the use of NMS-AQ brought a 1.03% improvement. Figure 2 indicate that the introduction of NMS-AQ improves the model performance on the HRSC2016 dataset, achieving gains of 0.92% and 4.01% on the mAP and mAP75, respectively.
Finally, all the modules mentioned above were integrated. As shown in Fig. 2, without adding any data augmentation, the AQE-Detector proposed achieved a 19.55% improvement in mAP75 on the HRSC2016 dataset. Despite using the data augmentation techniques, AQE-Detector still achieved a significant improvement of 25.08% on the mAP75, and AQE-Detector can further enhance its high-precision detection performance. For the DOTA dataset, AQE-Detector improved the mAP and mAP75 by 2.26% and 4.89% respectively, indicating the effectiveness for oriented objects of any category. For further comparison, the visualizations were visualized in Figs. 3 and 4, indicating that AQE-Detector can bring significant advantages. As shown, Faster R-CNN and RoI-Transformer produces inaccurate oriented boxes, particularly for densely packed and elongated objects like ships. With the introduction of the AQE module, the angle predictions become significantly more precise, leading to better-aligned bounding boxes. The integration of the Double-A Loss further refines the detection, especially for angle-sensitive high-aspect-ratio objects. Finally, our full model with NMS-AQ demonstrates the best performance, effectively suppressing duplicate detections while retaining the most accurately oriented boxes, resulting in the cleanest and most precise detection results.
Fig. 3.
Visualizations of ablation experiments on DOTA validation set. The base image is based on DOTA dataset and visualized using Python 3.7, OpenCV 4.5.4 (https://opencv.org), and Matplotlib 3.5.1 (https://matplotlib.org) software.
Fig. 4.
Visualizations of ablation experiments on the HRSC2016 test set. The base image is based on HRSC2016 dataset and visualized using Python 3.7, OpenCV 4.5.4 (https://opencv.org), and Matplotlib 3.5.1 (https://matplotlib.org) software.
Parameter analysis
Hyperparameter analysis of angle resolution.
The impact of angular resolution on model detection was compared under different angular resolution settings. Setting the angular resolution Rθ to 10, 5, and 1 respectively, Table 4 shows the model detection at different Rθ. According to Table 4, when the angular resolution is set to 10, the iteration time is only 0.3440s. Due to the low angular resolution allowing the model to make ambiguous predictions, the detection model achieved the lowest mAP accuracy. When the angular resolution is set to 5, the mAP and mAP75 increase to 72.64% and 37.50%, respectively. Due to the low computational cost introduced by the AQE during the training phase, it has almost no impact on the time cost. The model maintains an approximate training time at an angular resolution of 1, indicating that this method can ensure overall efficiency with almost no increase in time cost.
Table 4.
Parameter analysis of AQE on DOTA validation set.
| Method | R θ | Training time/iteration | mAP | mAP75 |
|---|---|---|---|---|
| AQE-Detector | 10 | 0.3440 | 72.29 | 33.04 |
| 5 | 0.3442 | 72.64 | 37.50 | |
| 1 | 0.3468 | 73.73 | 39.97 |
-
2.
Hyperparameter analysis of NMS-AQ.
ξ is crucial in non-maximum suppression based on angle quality (NMS-AQ), and is set to 0, 0.05, 0.1, 0.2, and 0.5, respectively. Table 5 shows that when ξ is set to 0.1, this method reaches the optimal detection accuracy. After introducing NMS-AQ, the mAP increased by 0.15%, and the mAP75 increased by 0.66%. Overall, with the change of ξ, the mAP75 consistently outperforms the model without NMS-AQ, verifying its effectiveness.
Table 5.
Parameter analysis of NMS-AQ on DOTA validation set.
| Method | NMS-AQ | ξ | mAP | mAP75 |
|---|---|---|---|---|
| AQE-Detector | × | 0 | 73.58 | 38.94 |
| √ | 0.05 | 73.69 | 39.60 | |
| √ | 0.1 | 73.73 | 39.97 | |
| √ | 0.2 | 73.58 | 39.75 | |
| √ | 0.5 | 72.63 | 39.58 |
-
3.
Hyperparameter analysis of Double-A Loss.
Angle sensitivity function and weight factor γ are key parameters in the angle loss function. As mentioned earlier, three conversion functions were selected for the angle sensitivity in this experiment. Table 6 shows that the fitted Gaussian function has significant effectiveness in characterizing the relationship between angle sensitivity and object aspect ratio. For the weight factor γ, this experiment sets it to 2, 5, and 10 for comparison. When γ is set to 5, the model achieves a better balance between simple and difficult samples.
Table 6.
Parameter analysis of Double-A loss on DOTA validation set.
| Method | Angle sensitivity function | γ | mAP | mAP75 |
|---|---|---|---|---|
| AQE-Detector | Linear function | 5 | 72.75 | 38.55 |
| Quadratic function | 5 | 73.12 | 37.98 | |
| Gaussian function | 5 | 73.73 | 39.97 | |
| Gaussian function | 2 | 73.20 | 38.78 | |
| Gaussian function | 10 | 73.25 | 39.16 |
Comparison results
This section conducted a comprehensive experimental comparison between our method (AQE-Detector) and currently advanced methods on three datasets: DOTA, HRSC2016, and ICDAR2015. As shown in Table 7, AQE-Detector achieved a detection accuracy of 78.35% on the mAP, significantly better than most other comparison methods. Compared to the baseline model41, this method achieved an improvement of 25.42% on mAP, verifying its superiority. When using RoI-Transformer42, this method achieved 80.87% on mAP, outperforming all existing methods. In addition, compared with the most advanced methods34,43,44, the evaluation indicators of AQE-Detector have improved by 3.33%, 3.02%, and 2.18% in terms of results, respectively. As shown in Fig. 5, this method can identify oriented objects in complex and dense scenes at different scales. In addition, the proposed algorithm can be combined with any existing method to improve its detection performance. As shown in Table 8, AQE-Detector can improve the overall efficiency of RetinaNet34, Faster-RCNN18, RoI-Transformer42, and ReDet43.
Table 7.
Comparative results on DOTA test set.
| Methods | Plane | BD | Bridge | GTF | SV | LV | Ship | TC | BC | ST | SBF | RA | Harbor | SP | HC | mAP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FR-O | 79.09 | 69.12 | 17.17 | 63.49 | 34.20 | 37.16 | 36.20 | 89.19 | 69.60 | 58.96 | 49.40 | 52.52 | 46.69 | 44.80 | 46.30 | 52.93 |
| R-DFPN | 80.92 | 65.82 | 33.77 | 58.94 | 55.77 | 50.94 | 54.78 | 90.33 | 66.34 | 68.66 | 48.73 | 51.76 | 55.10 | 51.32 | 35.88 | 57.94 |
| R2CNN | 80.94 | 65.67 | 35.34 | 67.44 | 59.92 | 50.91 | 55.81 | 90.67 | 66.92 | 72.39 | 55.06 | 52.23 | 55.14 | 53.35 | 48.22 | 60.67 |
| RRPN | 88.52 | 71.20 | 31.66 | 59.30 | 51.85 | 56.19 | 57.25 | 90.81 | 72.84 | 67.38 | 56.69 | 52.84 | 53.08 | 51.94 | 53.58 | 61.01 |
| ICN | 81.40 | 74.30 | 47.70 | 70.30 | 64.90 | 67.80 | 70.00 | 90.80 | 79.10 | 78.20 | 53.60 | 62.90 | 67.00 | 64.20 | 50.20 | 68.20 |
| RoI-Transformer | 88.64 | 78.52 | 43.44 | 75.92 | 68.81 | 73.68 | 83.59 | 90.74 | 77.27 | 81.46 | 58.39 | 53.54 | 62.83 | 58.93 | 47.67 | 69.56 |
| CAD-Net | 87.80 | 82.40 | 49.40 | 73.50 | 71.10 | 63.50 | 76.70 | 90.90 | 79.20 | 73.30 | 48.40 | 60.90 | 62.00 | 67.00 | 62.20 | 69.90 |
| DRN | 88.91 | 80.22 | 43.52 | 63.35 | 73.48 | 70.69 | 84.94 | 90.14 | 83.85 | 84.11 | 50.12 | 58.41 | 67.62 | 68.60 | 52.50 | 70.70 |
| DAL | 88.61 | 79.69 | 46.27 | 70.37 | 65.89 | 76.10 | 78.53 | 90.84 | 79.98 | 78.41 | 58.71 | 62.02 | 69.23 | 71.32 | 60.65 | 71.78 |
| SCRDet | 89.98 | 80.65 | 52.09 | 68.36 | 68.36 | 60.32 | 72.41 | 90.85 | 87.94 | 86.86 | 65.02 | 66.68 | 66.25 | 68.24 | 65.21 | 72.61 |
| R3Det | 89.49 | 81.17 | 50.53 | 66.10 | 70.92 | 78.66 | 78.21 | 90.81 | 85.26 | 84.23 | 61.81 | 63.77 | 68.16 | 69.83 | 67.17 | 73.74 |
| Gliding Vertex | 89.64 | 85.00 | 52.26 | 77.34 | 73.01 | 73.14 | 86.82 | 90.74 | 79.02 | 86.81 | 59.55 | 70.91 | 72.94 | 70.86 | 57.32 | 75.02 |
| Mask OBB | 89.56 | 85.95 | 54.21 | 72.90 | 76.52 | 74.16 | 85.63 | 89.85 | 83.81 | 86.48 | 54.89 | 69.64 | 73.94 | 69.06 | 63.32 | 75.33 |
| CSL | 90.25 | 85.53 | 54.64 | 75.31 | 70.44 | 73.51 | 77.62 | 90.84 | 86.15 | 86.69 | 69.60 | 68.04 | 73.83 | 71.10 | 68.93 | 76.17 |
| DCL | 89.14 | 83.93 | 53.05 | 72.55 | 78.13 | 81.97 | 86.94 | 90.36 | 85.98 | 86.94 | 66.19 | 65.66 | 73.72 | 71.53 | 68.69 | 76.97 |
| ReDet | 88.81 | 82.48 | 60.83 | 80.82 | 78.34 | 86.06 | 88.31 | 90.87 | 88.77 | 87.03 | 68.65 | 66.90 | 79.26 | 79.71 | 74.67 | 80.10 |
| AQE-Detector | 89.49 | 85.89 | 55.46 | 77.32 | 74.17 | 80.33 | 87.65 | 90.82 | 86.99 | 86.52 | 66.58 | 69.06 | 77.12 | 78.35 | 69.51 | 78.35 |
|
RoI-Transformer +AQE-Detector |
89.23 | 85.35 | 60.26 | 80.13 | 80.27 | 85.58 | 88.46 | 90.88 | 87.51 | 88.07 | 69.80 | 70.32 | 80.97 | 82.16 | 74.03 | 80.87 |
Fig. 5.
Visualization on DOTA test set. The base image is based on DOTA dataset and visualized using Python 3.7, OpenCV 4.5.4 (https://opencv.org), and Matplotlib 3.5.1 (https://matplotlib.org) software.
Table 8.
Algorithm transferability analysis.
| Methods | Networks | AQE-Detector | mAP | mAP75 |
|---|---|---|---|---|
| RetinaNet34 | ResNet−50 | × | 64.31 | 36.40 |
| √ | 66.86 (+ 2.55) | 38.04 (+ 1.64) | ||
| Faster-RCNN18 | ResNet−50 | × | 71.47 | 35.28 |
| √ | 73.73 (+ 2.26) | 39.97 (+ 4.69) | ||
| RoI-Transformer42 | ResNet−50 | × | 75.54 | 47.02 |
| √ | 76.31 (+ 0.77) | 48.33 (+ 1.31) | ||
| ReDet43 | ReResNet−50 | × | 76.52 | 51.73 |
| √ | 76.81 (+ 0.29) | 52.03 (+ 0.30) |
The Double-A Loss adaptively adjusts the supervision intensity based on an object’s aspect ratio. This design is particularly beneficial for objects whose localization accuracy is highly sensitive to angle prediction. Categories with extremely high or low aspect ratios, such as Large Vehicle (LV), Ship, and Harbor, show consistent improvements. For instance, Helicopter (HC) achieves a remarkable gain of 5.96% in mAP with Double-A Loss. This can be attributed to the fact that helicopters, while sometimes appearing near-square, often have distinct orientations due to their tail boom, making them moderately angle-sensitive. The Double-A Loss provides more nuanced supervision than a one-size-fits-all loss, allowing the model to learn more accurate angles for such targets. Conversely, for categories with more balanced aspect ratios like Storage Tank (ST) or Tennis Court (TC), which are less sensitive to angle errors, the gains from Double-A Loss are smaller, as expected. This confirms that our method intelligently allocates learning resources.
The gains from AQ-NMS are most pronounced in cluttered scenes containing multiple instances of elongated objects. For example, in the Ship and Large Vehicle (LV) categories, where objects are densely packed and have high aspect ratios, a small angle error can lead to a significant drop in IoU. AQ-NMS effectively prioritizes boxes with more reliable angle predictions, leading to noticeable improvements in mAP75 (e.g., + 2.05% for Ship). Conversely, for isolated objects or categories with more compact shapes (e.g., Basketball Court (BC) or Baseball Diamond (BD)), the classification confidence and IoU are already strong indicators of box quality. In these cases, the angle quality provides less additional information, resulting in more limited gains from AQ-NMS. This is not a failure but rather an expected behavior, demonstrating that AQ-NMS selectively provides the most value in complex, dense scenarios where angle precision is critical.
Table 9 shows the overall comparison of different models on the HRSC2016 dataset. Proposed AQE-Detector achieved an accuracy of 90.02% in the mAP, significantly better than the baseline model by 34.32%, demonstrating the detection advantage of AQE-Detector for larger aspect-ratio objects. In addition, compared with the most advanced methods40,42,43,45,46, this method has improved by 3.82%, 1.82%, 0.76%, 0.45%, and 0.40%, respectively. Figure 6 visually illustrates the performance of AQE-Detector, as most ships have a large aspect ratio, have accurate recognition.
Table 9.
Comparative results on HRSC2016.
| Methods | Size of image input | mAP |
|---|---|---|
| Fast-RCNN + SRBBS | – | 55.70 |
| BL2 | – | 69.60 |
| R2CNN | 800 × 800 | 73.07 |
| IENet | 1024 × 1024 | 75.01 |
| RC1&RC2 | – | 75.70 |
| RRPN | 800 × 800 | 79.05 |
| RRD | 384 × 384 | 84.30 |
| RoI-Transformer | 512 × 800 | 86.20 |
| Gliding Vertex | – | 88.20 |
| R3Det | 800 × 800 | 89.26 |
| DCL | 800 × 800 | 89.46 |
| GRS-Det | 800 × 800 | 89.57 |
| CSL | 800 × 800 | 89.62 |
| DAL | 800 × 800 | 89.77 |
| AQE-Detector | 800 × 800 | 90.02 |
Fig. 6.
Visualization on the HRSC2016 test set. The base image is based on HRSC2016 dataset and visualized using Python 3.7, OpenCV 4.5.4 (https://opencv.org), and Matplotlib 3.5.1 (https://matplotlib.org) software.
The proposed AQE-Detector was evaluated on ICDAR2015. This benchmark contains many directed texts and complex backgrounds. To demonstrate the performance of AQE-Detector, it was compared with SOTAs40,47,48 and some common text detection methods49–53. As shown in Table 10, in terms of F1 Measure, this method achieved 84.58% performance, significantly better than other methods. Compared with SCRDet47, this method improved F1 Measure by 4.50%. Compared with DAL48 and CSL40, this method still has good performance when facing oriented text objects and complex backgrounds.
Table 10.
Comparative results on ICDAR2015.
| Methods | Precision | Recall | F1-Measure |
|---|---|---|---|
| CTPN | 74.22 | 51.56 | 60.85 |
| Seglink | 73.10 | 76.80 | 75.00 |
| RRPN | 82.17 | 73.23 | 77.44 |
| EAST | 78.33 | 83.27 | 80.72 |
| SCRDet | 81.30 | 78.90 | 80.08 |
| RRD | 85.60 | 79.00 | 82.20 |
| DAL | 84.40 | 80.50 | 82.40 |
| CSL | 84.30 | 83.00 | 83.65 |
| AQE-Det | 85.65 | 83.53 | 84.58 |
Conclusion
A new oriented object detection technique based on angle quality estimation (AQE-detector) was proposed to solve the problems of missing confidence and insufficient localization accuracy in object detection for remote sensing, effectively improving the high-precision detection performance of remote sensing objects. Using periodic Gaussian distribution to model angle variables, the detection task was transformed into a distribution estimation task, overcoming the problem of inconsistent loss function values and object localization results caused by angle periodicity, while implicitly estimating the confidence information of the predicted angle. Non-maximum suppression based on angle quality (NMS-AQ) was proposed. By comprehensively considering angle confidence and classification confidence, the bias in evaluating detection results was reduced. In addition, the relationship between the angle sensitivity and aspect-ratio was quantitatively analyzed, and an angle loss function based on aspect-ratio perception (Double-A Loss) was constructed to flexibly adjust the loss weight. The intelligent object detection technology under information limited conditions has been studied, facilitating the efficient application of object detection in major demand fields.
In future work, in-depth research will be conducted from the following aspects. (1) Object detection based on cross modal information. This study only uses single modal image data, and lacks the comprehensive utilization of multimodal image information. How to fully explore the complementary information contained in different modal data is crucial for improving object detection performance. (2) Sustainable detection based on mixed supervision information. The actual application scenarios are usually dynamically open, and target data with new categories and appearances will emerge over time. Object detection models based on single training and lifelong use are difficult to meet the ever-changing practical needs. Therefore, it is crucial to research a sustainable object detection method that can be updated online and learn incrementally. (3) Benchmarking and performance validation with the latest methods. In response to the potential obsolescence of the comparative methods used in this study, we plan to systematically conduct large-scale comparative experiments with the latest advanced methods published in 2024–2025 in future work to further validate the competitiveness and generality of AQE detector. This will include performance evaluation and ablation studies on multiple datasets to ensure that our approach can continuously address the challenges of field development.
Acknowledgments
Thanks for the hard work of the editors and the constructive suggestions of the anonymous reviewers. And thank the teachers and students who provide help for the data processing and the study design of this article.
Author contributions
C. Li: Writing - origin draft, Investigation, Data curation, Formal analysis, Conceptualization, Resources. All authors reviewed the manuscript.
Funding
No Funding.
Data availability
The datasets used in this article are public datasets.The DOTA dataset is available at https://captain-whu.github.io/DOTA.The HRSC2016 dataset is available at https://aistudio.baidu.com/datasetdetail/54106.The ICDAR 2015 dataset is available at http://www.robots.ox.ac.uk/~vgg/data/text/.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Mehmood, M., Shahzad, A., Zafar, B., Shabbir, A. & Ali, N. Remote sensing image classification: A comprehensive review and applications. Math. Probl. Eng.2022 (1), 5880959 (2022). [Google Scholar]
- 2.Wang, Y., Bashir, S. M. A., Khan, M., Ullah, Q., Wang, R., Song, Y., & Niu, Y. Remote sensing image super-resolution and object detection: Benchmark and state of the art. Exp. Syst. Appli. 197, 116793 (2022).
- 3.Wenqi, Y. U. et al. MAR20: A benchmark for military aircraft recognition in remote sensing images. Natl. Remote Sens. Bull.27 (12), 2688–2696 (2024). [Google Scholar]
- 4.Jianya, G., Haigang, S., Guorui, M. & Qiming, Z. A review of multi-temporal remote sensing data change detection algorithms. Int. Archives Photogrammetry Remote Sens. Spat. Inform. Sci.37 (B7), 757–762 (2008). [Google Scholar]
- 5.Topouzelis, K., Papageorgiou, D., Suaria, G. & Aliani, S. Floating marine litter detection algorithms and techniques using optical remote sensing data: A review. Mar. Pollut. Bull.170, 112675 (2021). [DOI] [PubMed] [Google Scholar]
- 6.Yi, H., Liu, B., Zhao, B. & Liu, E. Small object detection algorithm based on improved YOLOv8 for remote sensing. IEEE J. Sel. Top. Appl. Earth Observations Remote Sens.17, 1734–1747 (2023). [Google Scholar]
- 7.Liu, Y. et al. Dual-Perspective alignment learning for multimodal remote sensing object Detection[J]. IEEE Trans. Geosci. Remote Sens.63, 5404015 (2025). [Google Scholar]
- 8.Sun, F., Li, H., Liu, Z., Li, X. & Wu, Z. Arbitrary-angle bounding box based location for object detection in remote sensing image. Eur. J. Remote Sens.54 (1), 102–116 (2021). [Google Scholar]
- 9.Zhou, L. et al. Arbitrary-oriented object detection in remote sensing images based on Polar coordinates. IEEE Access.8, 223373–223384 (2020). [Google Scholar]
- 10.Zand, M., Etemad, A. & Greenspan, M. Oriented bounding boxes for small and freely rotated objects. IEEE Trans. Geosci. Remote Sens.60, 1–15 (2021). [Google Scholar]
- 11.Wei, C., Ni, W., Qin, Y., Wu, J., Zhang, H., Liu, Q., & Bian, H. Ridop: A rotation-invariant detector with simple oriented proposals in remote sensing images.Remote Sens.15 (3), 594 (2023).
- 12.Liu, Y. et al. ABNet: adaptive balanced network for multiscale object detection in remote sensing imagery[J]. IEEE Trans. Geosci. Remote Sens.60, 1–14 (2021). [Google Scholar]
- 13.Wang, G. et al. High-quality angle prediction for oriented object detection in remote sensing images. IEEE Trans. Geosci. Remote Sens.61, 1–14 (2023). [Google Scholar]
- 14.Song, B., Li, J., Wu, J., Chang, J. & Wan, J. Direction prediction redefinition: transfer angle to scale in oriented object detection. IEEE Trans. Circuits Syst. Video Technol.34 (12), 12894–12906 (2024). [Google Scholar]
- 15.Liu, C., Zhang, S., Hu, M. & Song, Q. Object detection in remote sensing images based on adaptive multi-scale feature fusion method. Remote Sens.16 (5), 907 (2024). [Google Scholar]
- 16.Jin, C., Zheng, A., Wu, Z. & Tong, C. Transformer-Based Multi-layer feature aggregation and rotated anchor matching for oriented object detection in remote sensing images. Arab. J. Sci. Eng.49 (9), 12935–12951 (2024). [Google Scholar]
- 17.Ravi, N. & El-Sharkawy, M. Addressing the gaps of IoU loss in 3D object detection with IIoU. Future Internet. 15 (12), 399 (2023). [Google Scholar]
- 18.Mandi, J. et al. Decision-focused learning: Foundations, state of the art, benchmark and future opportunities. J. Artif. Intell. Res.80, 1623–1701 (2024). [Google Scholar]
- 19.Zheng, S., Wu, Z., Du, Q., Xu, Y. & Wei, Z. Oriented object detection for remote sensing images via object-wise rotation-invariant semantic representation. IEEE Trans. Geosci. Remote Sens.62, 1–15 (2024). [Google Scholar]
- 20.Chen, Z., Xiong, B., Chen, X., Min, G. & Li, J. Joint computation offloading and resource allocation in multi-edge smart communities with personalized federated deep reinforcement learning. IEEE Trans. Mob. Comput.23 (12), 11604–11619 (2024). [Google Scholar]
- 21.Xu, C., Su, H., Gao, L., Wu, J. & Yan, W. Rotated SAR ship detection based on Gaussian Wasserstein distance loss. Mob. Networks Appl.28 (5), 1842–1851 (2023). [Google Scholar]
- 22.Li, P. & Zhu, C. Ro-YOLOv5: one new detector for impurity in wheat based on circular smooth label. Crop Prot.184, 106806 (2024). [Google Scholar]
- 23.Poitras, I. et al. Validity and reliability of wearable sensors for joint angle estimation: A systematic review. Sensors19 (7), 1555 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Yueh, S. H., Higgins, F., Lin, Z., Todhunter, R. J. & Zhang, Y. Diffusion data augmentation for enhancing Norberg hip angle Estimation. Vet. Radiol. Ultrasound66 (1), e13463. (2025). [DOI] [PubMed]
- 25.Barzegar Khanghah, A., Fernie, G. & Roshan Fekr, A. Joint angle Estimation during shoulder abduction exercise using contactless technology. Biomed. Eng. Online. 23 (1), 11 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Snyder, C., Martinez, A., Strutzenberger, G. & Stöggl, T. Connected skiing: validation of edge angle and radial force Estimation as motion quality parameters during alpine skiing. Eur. J. Sport Sci.22 (10), 1484–1492 (2022). [DOI] [PubMed] [Google Scholar]
- 27.Zaghari, N., Fathy, M., Jameii, S. M. & Shahverdy, M. The improvement in obstacle detection in autonomous vehicles using YOLO non-maximum suppression fuzzy algorithm. J. Supercomput.77 (11), 13421–13446 (2021). [Google Scholar]
- 28.Li, B., Song, S. & Ai, L. Rethinking the Non-Maximum suppression step in 3D object detection from a Bird’s-Eye view. Electronics13 (20), 4034 (2024). [Google Scholar]
- 29.Chen, Y. et al. Automated alzheimer’s disease classification using deep learning models with Soft-NMS and improved ResNet50 integration. J. Radiation Res. Appl. Sci.17 (1), 100782 (2024). [Google Scholar]
- 30.Wu, S., Li, X. & Wang, X. IoU-aware single-stage object detector for accurate localization. Image Vis. Comput.97, 103911 (2020). [Google Scholar]
- 31.Wang, L., Mu, X., Ma, C. & Zhang, J. Hausdorff Iou and context maximum selection nms: improving object detection in remote sensing images with a novel metric and postprocessing module. IEEE Geosci. Remote Sens. Lett.19, 1–5 (2021). [Google Scholar]
- 32.Hu, S. et al. Improving YOLOv7-tiny for infrared and visible light image object detection on drones. Remote Sens.15 (13), 3214 (2023). [Google Scholar]
- 33.Li, W. et al. Ellipse IoU loss: better learning for rotated bounding box regression. IEEE Geosci. Remote Sens. Lett.21, 1–5 (2023). [Google Scholar]
- 34.Lin, T. Y., Goyal, P., Girshick, R., He, K. & Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision 2980–2988 (2017).
- 35.Zhang, Q., Bao, X., Sun, S. & Lin, F. Lightweight network for small target fall detection based on feature fusion and dynamic Convolution. J. Real-Time Image Proc.21 (1), 17 (2024). [Google Scholar]
- 36.Min, H., Rahmani, A. M., Ghaderkourehpaz, P., Moghaddasi, K. & Hosseinzadeh, M. A joint optimization of resource allocation management and multi-task offloading in high-mobility vehicular multi-access edge computing networks. Ad Hoc Netw.166, 103656 (2025). [Google Scholar]
- 37.Luo, J., Hu, Y. & Li, J. Surround-net: A multi-branch arbitrary-oriented detector for remote sensing. Remote Sens.14 (7), 1751 (2022). [Google Scholar]
- 38.Liu, Z., Yuan, L., Weng, L. & Yang, Y. A high resolution optical satellite image dataset for ship recognition and some new baselines. In International Conference on Pattern Recognition Applications and Methods vol. 2, 324–331 (SciTePress, 2017 February).
- 39.Karatzas, D., Gomez-Bigorda, L., Nicolaou, A., Ghosh, S., Bagdanov, A., Iwamura, M., & Valveny, E. ICDAR 2015 competition on robust reading. In 2015 13th International Conference on Document Analysis and Recognition (ICDAR) 1156–1160 (IEEE, 2015 August).
- 40.Yang, X. & Yan, J. Arbitrary-oriented object detection with circular smooth label. In European Conference on Computer Vision 677–694 (Cham, Springer International Publishing, 2020 August).
- 41.Xia, G. S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., & Zhang, L. DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 3974–3983 (2018).
- 42.Ding, J., Xue, N., Long, Y., Xia, G. S. & Lu, Q. Learning RoI transformer for oriented object detection in aerial images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2849–2858 (2019).
- 43.Han, J., Ding, J., Xue, N. & Xia, G. S. Redet: A rotation-equivariant detector for aerial object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2786–2795 (2021).
- 44.Xu, Y. et al. Gliding vertex on the horizontal bounding box for multi-oriented object detection. IEEE Trans. Pattern Anal. Mach. Intell.43 (4), 1452–1459 (2020). [DOI] [PubMed] [Google Scholar]
- 45.Yang, X., Yan, J., Feng, Z. & He, T. R3det: Refined single-stage detector with feature refinement for rotating object. In Proceedings of the AAAI Conference on Artificial Intelligence vol. 35, No. 4, 3163–3171 (2021 May).
- 46.Zhang, X. et al. GRS-Det: an anchor-free rotation ship detector based on Gaussian-mask in remote sensing images. IEEE Trans. Geosci. Remote Sens.59 (4), 3518–3531 (2020). [Google Scholar]
- 47.Yang, X., Yang, J., Yan, J., Zhang, Y., Zhang, T., Guo, Z., & Fu, K. Scrdet:Towards more robust detection for small, cluttered and rotated objects. In Proceedings of the IEEE/CVF International Conference on Computer Vision 8232–8241 (2019).
- 48.Ming, Q., Zhou, Z., Miao, L., Zhang, H. & Li, L. Dynamic anchor learning for arbitrary-oriented object detection. In Proceedings of the AAAI Conference on Artificial Intelligence vol. 35, No. 3, 2355–2363 (2021 May).
- 49.Ma, J. et al. Arbitrary-oriented scene text detection via rotation proposals. IEEE Trans. Multimedia. 20 (11), 3111–3122 (2018). [Google Scholar]
- 50.Liao, M., Zhu, Z., Shi, B., Xia, G. S. & Bai, X. Rotation-sensitive regression for oriented scene text detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 5909–5918 (2018).
- 51.Zhou, X. et al. East: an efficient and accurate scene text detector. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 5551–5560 (2017).
- 52.Tian, Z., Huang, W., He, T., He, P. & Qiao, Y. Detecting text in natural image with connectionist text proposal network. In European Conference on Computer Vision 56–72 (Cham, Springer International Publishing, 2016 September).
- 53.Shi, B., Bai, X. & Belongie, S. Detecting oriented text in natural images by linking segments. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2550–2558 (2017).
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Data Availability Statement
The datasets used in this article are public datasets.The DOTA dataset is available at https://captain-whu.github.io/DOTA.The HRSC2016 dataset is available at https://aistudio.baidu.com/datasetdetail/54106.The ICDAR 2015 dataset is available at http://www.robots.ox.ac.uk/~vgg/data/text/.























