Enhanced pedestrian walkway object detection using deep learning and pelican optimization algorithm for assisting disabled persons

Fadwa Alrowais; Mona Almofarreh; Radwa Marzouk

doi:10.1038/s41598-025-32129-0

2025 Dec 10;16:2286. doi: 10.1038/s41598-025-32129-0

PMCID: PMC12816733 PMID: 41372505

Abstract

Walking is an important mode of transportation, but pedestrian environments remain highly challenging for individuals with blindness. Pedestrians with blindness orient themselves using guiding cues in their surroundings, which may be artificial or natural. To overcome these difficulties, it is essential for them to perceive the features of their environment. Currently, aids such as long white canes and GPS are deployed to make pedestrian walkways more navigable for blind people, who can use them as primary assistive devices for recognizing vital environmental features. Recently, growing success has been reported for vision-based navigation tasks that depend on deep learning (DL) and machine learning (ML) networks to aid visually impaired people. This study proposes an Enhanced Pedestrian Walkway Object Detection and Pelican Optimization Algorithm for Assisting Disabled Persons (EPWOD-POAADP) method. The main aim of the EPWOD-POAADP method is to improve pedestrian walkway navigation for blind people. At first, the image pre-processing stage applies median filtering (MF) to eliminate the noise in the input data. Next, the Faster R-CNN model is employed for the object detection process to identify and locate objects within an image. The CapsNet model is used for the feature extraction process. In addition, the wavelet neural network (WNN) technique is implemented for the detection and classification process. Finally, the hyperparameter selection of the WNN model is performed using the pelican optimization algorithm (POA) technique. The EPWOD-POAADP approach is evaluated experimentally on the UCSD anomaly detection dataset. The outcomes indicate the enhanced performance of the EPWOD-POAADP approach compared to recent approaches.

Keywords: Pedestrian walkway, Object detection, Pelican optimization algorithm, Disabled persons, Faster R-CNN

Subject terms: Diseases, Health care, Mathematics and computing

Introduction

The World Health Organization (WHO) states that at least one billion individuals were blind as of 2020. Blindness is mainly caused by age-related cataracts, congenital neurological defects, and uncorrected refractive errors¹. For those who are blind, both the confidence and the independence needed to undertake everyday routines are affected². People with visual ailments and deficiencies require support to accomplish daily tasks, such as exploring and moving through unfamiliar settings. Despite several technological developments, blindness remains a significant challenge³. Pedestrians with visual impairments usually miss much of the information about their immediate setting that sighted people take for granted⁴. Although many individuals compensate for the missing information through heightened awareness of other cues and the use of navigational aids, either low-tech (for instance, guide dogs or white canes) or high-tech (for example, GPS devices), there are still many circumstances in which people with visual impairments are unable to travel as independently as they would like⁵.

For people with visual impairments, travelling to an unfamiliar setting can be particularly difficult⁶. Consequently, when travellers with visual impairments head for unknown destinations, they are frequently required to plan far ahead to obtain and memorize directions, and many seek support from others, including family members, friends, and specialized trainers, to familiarize themselves with an unknown location⁷. Even on fairly familiar routes, managing sudden needs during a journey, like finding a drink, food, or a toilet, can be challenging. Essentially, every such need can involve mastering a further path, and it is not easy to predict in advance every route one may need to know⁸. Researchers are addressing this concern by developing assistive devices for visually impaired people (VIPs). Nowadays, many computer vision (CV) tasks are modelled around processes such as data acquisition, feature extraction, and behavioural learning⁹. Deep learning (DL) and machine learning (ML) belong to a field of artificial intelligence (AI) that employs statistical models to learn hidden patterns from available data and to make decisions about unseen records; DL- and ML-based models are effective assistive methods for helping visually impaired people walk outdoors and indoors¹⁰.

This study proposes an Enhanced Pedestrian Walkway Object Detection and Pelican Optimization Algorithm for Assisting Disabled Persons (EPWOD-POAADP) method. The main aim of the EPWOD-POAADP method is to improve pedestrian walkway navigation for blind people. At first, the image pre-processing stage applies median filtering (MF) to eliminate the noise in the input data. Next, the Faster R-CNN model is employed for the object detection process to identify and locate objects within an image. The CapsNet model is used for the feature extraction process. In addition, the wavelet neural network (WNN) technique is implemented for the detection and classification process. Finally, the hyperparameter selection of the WNN model is performed using the pelican optimization algorithm (POA) technique. The EPWOD-POAADP approach is evaluated experimentally using a benchmark image dataset. The major contributions of the EPWOD-POAADP approach are listed below.

The EPWOD-POAADP model initially utilizes MF to eliminate impulse noise and preserve edge details, improving image clarity. This enhances the quality of inputs for subsequent processing stages and strengthens the model’s overall robustness and reliability.
The Faster R-CNN method is integrated into the framework to enable precise and efficient object detection by generating accurate region proposals. This supports reliable identification of relevant targets within the input images. Its inclusion improves the detection performance of the EPWOD-POAADP technique.
The CapsNet technique is employed for robust feature extraction, effectively preserving the spatial hierarchies and relationships in visual data. This improves the model’s capability to capture part-whole relationships and orientation differences. Its integration strengthens the feature representation, resulting in enhanced classification results.
The EPWOD-POAADP approach employs the WNN technique to detect and classify the extracted features, enabling multi-resolution analysis of complex patterns. This improves the method’s capacity to capture time and frequency information, improving detection results and reliability.
The EPWOD-POAADP methodology implements the POA model to tune the WNN’s hyperparameters, improving its learning efficiency and convergence speed. This results in enhanced detection and classification results and supports robust and reliable model performance.
Integrating the POA-tuned WNN with CapsNet and Faster R-CNN forms a hybrid architecture that effectively combines robust feature extraction, precise object detection, and optimized classification. This synergy enhances the efficiency of detection tasks. The method draws on the strengths of each component, resulting in a robust and optimized solution.

Literature review

Bhatlawande et al.¹¹ proposed a model for aiding visually impaired people (VIPs) by detecting and classifying upcoming obstacles such as pedestrians and vehicles on the way. While walking on pathways or roads, VIPs have limited access to information about their surroundings; thus, identifying approaching cars or pedestrians is crucial for their safety. Walking from one place to another is one of the most difficult tasks for VIPs. Trained dogs and white canes are the aids most frequently used to help VIPs navigate and travel. Kumar et al.¹² proposed an obstacle recognition framework combining a road object detection method and a road anomaly recognition method, run in parallel for fast real-world operation. These techniques rely on CNN backbones, use transfer learning (TL), and are trained on custom datasets gathered manually in unstructured settings. Adi et al.¹³ aimed to design and evaluate disability-friendly pedestrian pathways for safety and optimal accessibility in Indonesia. The pedestrian pathway design was obtained using data triangulation. In¹⁴, a new approach to determining the ground impedance under a single shoe is proposed. The approach uses bipolar electrodes to confine leakage from the body. A finite element analysis (FEA) is carried out to demonstrate the benefits of bipolar over unipolar electrodes. Laboratory tests, field tests, and an error analysis are performed on the developed prototype to assess the method’s utility.

Hamadi and Latoui¹⁵ introduced an innovative and promising indoor localization solution that addresses the limitations of both SLAM and PDR through their synergistic integration, correcting the cumulative errors of the localization method and consequently improving its precision. Yoshikawa and Premachandra¹⁶ proposed an automated sensing model for pedestrian crossings that employs images from attached cameras. The model has distinctive features that allow it to handle difficult circumstances with which conventional models struggle. It excels at identifying crosswalks even in low-light conditions at night, when illumination levels may vary. Guo and Shen¹⁷ aimed to use the Internet of Things (IoT) and other smart devices to develop a smart pedestrian crossing that is safer and more helpful, specifically for people with visual or mobility impairments. IoT and other smart devices were first selected to alert drivers and assist pedestrians effectively. Then, the LED light signalling scheme and the operation of audible pedestrian bollards were redesigned to improve their effectiveness in supporting people with visual or mobility impairments. Moreover, virtual reality (VR) was used to simulate the smart pedestrian crossing. Finally, a smart space design concept for the smart pedestrian crossing is presented.

The limitations of the existing studies include the lack of large-scale real-time testing in dynamic environments and minimal consideration of adaptability across varied terrains or unstructured settings. Most approaches depend on static sensors or specific hardware configurations, limiting flexibility and scalability. The dependency on CNN and TL models often requires extensive computational resources, which may not suit portable devices. The use of FEA and VR is limited to simulation environments with minimal real-world validation. A major research gap lies in integrating lightweight, real-time, cross-environment pedestrian support systems for VIPs that merge IoT, adaptive ML models, and environmental context awareness in uncontrolled conditions.

Proposed models

This paper proposes a novel EPWOD-POAADP method. The main aim of the technique is to enhance the pedestrian walkway method for blind people’s navigation. Figure 1 represents the entire flow of the EPWOD-POAADP model.

Fig. 1 — Overall flow of EPWOD-POAADP model.

Stage I: image pre-processing

At first, the image pre-processing stage applies MF to eliminate the noise in the input data¹⁸. MF is chosen for its robust capability to remove impulsive noise while preserving essential edges and details, which is crucial for accurate object detection and classification. Unlike mean filtering, which can blur edges, MF maintains sharp boundaries, improving the quality of input images. Its nonlinear nature makes it particularly effective against the salt-and-pepper noise commonly found in real-world images. Furthermore, MF is computationally efficient and simple to implement, making it appropriate for real-time applications. These merits collectively justify its selection over other smoothing techniques.

MF is a nonlinear image processing technique frequently employed to reduce noise while preserving edges in images. In assessing pedestrian walkways to help disabled persons, MF improves the quality of input images by eliminating unwanted noise, such as distortions from low-quality camera sensors or weather conditions. This pre-processing stage helps object detection systems identify walkway cracks, obstacles, or other hazards more precisely. By enhancing the clarity of image data, MF helps assess the accessibility and safety of pedestrian paths for persons with disabilities, eventually contributing to better urban planning and infrastructure development.
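As a rough illustration of the pre-processing idea, the sketch below applies a median filter to a small grayscale patch using NumPy. The function name `median_filter`, the 3 x 3 window, and the edge-replication padding are assumptions made for this example; the paper does not state the window size or border handling it uses.

```python
import numpy as np


def median_filter(image: np.ndarray, size: int = 3) -> np.ndarray:
    """Apply a size x size median filter to a 2-D grayscale image.

    Hypothetical helper for illustration; window size and padding mode
    are assumptions, not values taken from the paper.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    pad = size // 2
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(image, pad, mode="edge")
    # View every size x size neighbourhood, then take its median.
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1)).astype(image.dtype)


# Toy input: a flat gray patch corrupted by a single "salt" pixel.
patch = np.full((5, 5), 100, dtype=np.uint8)
patch[2, 2] = 255
cleaned = median_filter(patch)
```

Because the median of each neighbourhood ignores isolated extreme values, the single bright pixel is replaced by the surrounding gray level while the rest of the patch is unchanged.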

Stage II: object detection

Next, the Faster R-CNN model is employed for the object detection process to identify and locate objects within an image¹⁹. This model is chosen for its balance between accuracy and speed, making it appropriate for real-time applications. It integrates a region proposal network (RPN) that efficiently produces high-quality region proposals, reducing computational overhead. Its end-to-end architecture allows for joint optimization, improving detection precision. Compared to single-stage detectors such as YOLO or SSD, Faster R-CNN generally attains higher accuracy, particularly for detecting small or overlapping objects. Its robustness in handling complex scenes and varying object scales makes it a suitable choice for precise and reliable detection tasks.

Deep ConvNets such as ResNets, VGGNets, DenseNet, and Inception networks are frequently applied for object detection due to their higher precision compared with earlier techniques. One well-known framework is R-CNN, which uses deep ConvNets to classify object proposals (candidate regions of interest). Though it attains high precision, it suffers from space and time inefficiencies. The method is slow and needs a large amount of storage, as it extracts features from all images and stores them on disk. Detection alone takes 47 s for one image. Fast R-CNN considerably improves the detection speed to 0.3 s per image by introducing an RoI pooling layer.

The drawback of Fast R-CNN is addressed by Faster R-CNN, which introduces the RPN. The RPN is implemented as a fully convolutional network that predicts object bounds and objectness scores. It attains translation invariance by using anchors with different aspect ratios and scales. By incorporating the deep VGG-16 model, the whole system can carry out both proposal generation and detection in just 0.2 s. One related study proposes an ensemble learning model based on DL methods for detecting distracted drivers. The model attains high precision by adapting the Faster R-CNN method and extracting pose cues from the driver’s posture (97.7% validation precision). The method concentrates on objects directly related to distraction and computes interaction relations using the intersection over union (IoU) metric. It attains a precision of 92.2%, exceeding R-CNN and Faster R-CNN. To establish its practicality, that study would still need to assess the real-world performance of the model, including response time and computational efficiency. Another study presents an enhanced Faster R-CNN method for small object detection. The model introduces new methods for RoI pooling and bounding box regression to deal with localization deviation problems. This demonstrates the effectiveness of Faster R-CNN for small object detection. Nevertheless, further investigation is needed to assess its performance on different domains and objects, considering computational complexity and possible drawbacks.
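Two building blocks mentioned in this stage, reference boxes at several scales and aspect ratios, and the overlap measure commonly called intersection over union (IoU), can be sketched in a few lines. This is only an illustrative sketch: the helper names `make_anchors` and `iou`, the three scales, and the three ratios are assumptions for the example and are not taken from the paper's configuration.

```python
import numpy as np


def make_anchors(cx, cy, scales=(64, 128, 256), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) boxes centred at (cx, cy).

    Each anchor has area scale**2 and height/width equal to ratio.
    Scales and ratios here are illustrative assumptions.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / np.sqrt(r)
            h = s * np.sqrt(r)
            anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


anchors = make_anchors(cx=100.0, cy=100.0)   # 3 scales x 3 ratios = 9 boxes
overlap = iou([0, 0, 2, 2], [1, 1, 3, 3])    # boxes sharing a 1 x 1 corner
```

In a region proposal network, boxes like these are scored for objectness and refined by regression; IoU against ground-truth boxes is what typically decides which anchors count as positive examples during training.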

Stage III: feature extraction

For the feature extraction process, the EPWOD-POAADP model employs CapsNet²⁰. This model is chosen as it effectively preserves spatial hierarchies and part-to-whole relationships in visual data, which conventional CNNs often overlook. Unlike standard CNNs, CapsNet uses dynamic routing to maintain orientation and pose data, improving robustness to image transformations and distortions. This results in enhanced generalization, particularly in complex scenarios where the spatial arrangement of features is crucial. Moreover, CapsNet requires fewer training samples to achieve high accuracy, making it effective in data-scarce environments. Its capability to capture richer feature representations presents a significant advantage over conventional feature extractors. Figure 2 illustrates the structure of CapsNet.

CapsNet is a recently introduced neural network (NN) that could considerably influence DL, mainly in computer vision (CV). The input and output of a neuron in a traditional CNN are scalars, whereas the neurons in CapsNet handle vectors. A capsule is therefore also called a vector neuron (VN), and its vector encodes all the essential information about the state of the feature that the capsule recognizes. When downsampling feature maps, the pooling layers of a CNN discard many essential features. Furthermore, a CNN fails to capture the relationships among the extracted features, so crucial information may be lost. CapsNet uses a squash function in place of pooling layers. This nonlinear function takes a vector as input and rescales its length without changing its orientation, so that no information is lost. The capsule’s operation is computed as follows:

$$\hat{u}_{j|i} = W_{ij}\, u_i \tag{1}$$

The prediction vector is represented as $\hat{u}_{j|i}$, received by capsule $j$ and produced by capsule $i$. It is obtained by multiplying the weight matrix $W_{ij}$ by the output $u_i$ of the preceding capsule layer.

$$s_j = \sum_i c_{ij}\, \hat{u}_{j|i} \tag{2}$$

The weighted sum of the prediction vectors $\hat{u}_{j|i}$ gives the total input $s_j$. In CapsNet, capsules are used instead of conventional CNN neurons, and all input and output units are vectors. The vector’s orientation encodes the properties of the entity detected in the input data, and the vector’s length indicates the probability that the entity is present in the current input. The squash function, which plays the role of the activation function in a CNN, guarantees that the vector length lies in $(0, 1)$. The function is defined in Eq. (3).

$$v_j = \frac{\lVert s_j \rVert^2}{1 + \lVert s_j \rVert^2}\, \frac{s_j}{\lVert s_j \rVert} \tag{3}$$

Here, the capsule’s total input vector is represented as $s_j$ and the capsule’s output vector as $v_j$.

$$c_{ij} = \frac{\exp(b_{ij})}{\sum_k \exp(b_{ik})} \tag{4}$$

The dynamic routing method defines the coupling coefficient $c_{ij}$ in Eq. (4) as a softmax over the parameters $b_{ij}$, which represent the prior probability that capsule $i$ is coupled to capsule $j$. CapsNet uses the parameter $b_{ij}$ to capture the relation between capsule $i$ in the lower layer and capsule $j$ in the layer above. The coupling coefficients of a capsule sum to one over all capsules in the layer above, and in the first iteration $b_{ij}$ is set to $0$. Using the dot product of $\hat{u}_{j|i}$ and $v_j$, Eq. (5) updates the parameter $b_{ij}$:

$$b_{ij} \leftarrow b_{ij} + \hat{u}_{j|i} \cdot v_j \tag{5}$$

The $b_{ij}$ value increases after the update in Eq. (5) when the dot product of $\hat{u}_{j|i}$ and $v_j$ is positive. This strengthens the bond between capsules $i$ and $j$: a greater $b_{ij}$ leads to a greater $c_{ij}$, which in turn yields greater $s_j$ and $v_j$ values. Conversely, the connection between capsules $i$ and $j$ is weakened when the dot product of $\hat{u}_{j|i}$ and $v_j$ is negative.
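The routing procedure described above can be sketched with NumPy. The code follows the commonly published routing-by-agreement formulation (squash nonlinearity, softmax coupling coefficients, dot-product agreement update), which is assumed here to match the paper's equations; the function names, the three routing iterations, and the random prediction vectors are illustrative choices only.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Shrink each vector to a length below 1 while keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def dynamic_routing(u_hat, n_iters=3):
    """Route prediction vectors u_hat of shape (n_in, n_out, dim).

    Returns the output capsule vectors v (n_out, dim) and the coupling
    coefficients c (n_in, n_out) used to compute them.
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                 # routing logits start at 0
    for _ in range(n_iters):
        # Softmax over the output capsules for each input capsule.
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Weighted sum of predictions gives each output capsule's input.
        s = np.einsum("ij,ijd->jd", c, u_hat)
        v = squash(s)
        # Agreement (dot product) raises or lowers the routing logits.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v, c


rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))   # 6 input capsules, 3 output capsules, dim 4
v, c = dynamic_routing(u_hat)
```

The length of each row of `v` stays below one, so it can be read as a presence score, and each row of `c` sums to one, reflecting how an input capsule distributes its vote across the output capsules.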

Stage IV: pedestrian walkway detection using WNN

Furthermore, the WNN technique is implemented for the detection and classification²¹. This technique was chosen for its robust capability in capturing both time-frequency information and nonlinear relationships within data, which is significant for handling complex patterns in pedestrian environments. This model integrates wavelet transform, effectively analyzing localized features and image variations. This results in an enhanced accuracy in detecting walkways, particularly in noisy or cluttered scenes. Moreover, WNN demonstrates faster convergence and better generalization with fewer parameters, making it computationally efficient and appropriate for real-time applications. Its ability to balance precision and speed presents a clear advantage over other detection models.

A WNN offers stronger learning ability, faster convergence, and better accuracy than conventional back-propagation (BP) neural networks and other feed-forward neural networks. Together with enhanced sensitivity in function approximation and strong fault tolerance, these benefits make WNNs particularly efficient at tackling complex signal denoising tasks. In this paper, the strengths of WNNs are exploited by combining the multi-scale analysis of wavelet transforms with the nonlinear processing capacity of NNs. This hybrid method permits WNNs to adaptively capture signal variations at different scales, allowing effective processing of both high- and low-frequency components. In the WNN structure, $x_1, x_2, \ldots, x_n$ denote the input parameters of the WNN, whereas $y_1, y_2, \ldots, y_m$ represent the predicted output values. $\omega_{ij}$ and $\omega_{jk}$ indicate the connection weights between the input and hidden layers (HL) and between the HL and the output layer, respectively.

If the input signal sequence is $x_i\ (i = 1, 2, \ldots, n)$, the output of the HL is:

$$h(j) = h_j\!\left(\frac{\sum_{i=1}^{n} \omega_{ij}\, x_i - b_j}{a_j}\right), \quad j = 1, 2, \ldots, l \tag{6}$$

In Eq. (6), $h(j)$ refers to the output value of the $j$-th node in the HL, $h_j$ stands for the wavelet basis function, and $a_j$ and $b_j$ represent its scaling and translation factors. The output layer is computed as:

$$y(k) = \sum_{j=1}^{l} \omega_{jk}\, h(j), \quad k = 1, 2, \ldots, m \tag{7}$$

In Eq. (7), $h(j)$ denotes the output value of the HL, $l$ is the number of HL nodes, and $m$ is the number of output layer nodes. The WNN typically uses a gradient correction method to adjust the network weights and the wavelet basis function parameters. The correction proceeds as follows.

Compute the prediction error of the WNN:

$$e = \sum_{k=1}^{m} \left( y_n(k) - y(k) \right) \tag{8}$$

In Eq. (8), $y_n(k)$ is the expected output, and $y(k)$ refers to the predicted output of the WNN.

Correct the weights of the WNN based on the prediction error:

$$\omega^{(t+1)} = \omega^{(t)} + \Delta\omega^{(t+1)} \tag{9}$$

The coefficients of the wavelet basis functions are modified based on the prediction error $e$:

$$a_j^{(t+1)} = a_j^{(t)} + \Delta a_j^{(t+1)} \tag{10}$$

$$b_j^{(t+1)} = b_j^{(t)} + \Delta b_j^{(t+1)} \tag{11}$$

In Eqs. (9)–(11), $\Delta\omega^{(t+1)}$, $\Delta a_j^{(t+1)}$, and $\Delta b_j^{(t+1)}$ are computed from the network prediction error as follows:

$$\Delta\omega^{(t+1)} = -\eta\, \frac{\partial e}{\partial \omega^{(t)}} \tag{12}$$

$$\Delta a_j^{(t+1)} = -\eta\, \frac{\partial e}{\partial a_j^{(t)}} \tag{13}$$

$$\Delta b_j^{(t+1)} = -\eta\, \frac{\partial e}{\partial b_j^{(t)}} \tag{14}$$

where $\eta$ refers to the learning rate of the network.
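To make the WNN computation concrete, here is a minimal NumPy sketch of a forward pass followed by one gradient-based correction. Several choices are assumptions made for the example and are not stated in the paper: the Morlet wavelet as the basis function, the layer sizes, a squared-error objective, and the simplification of correcting only the output-layer weights (the full method also corrects the input weights and the wavelet scaling and translation factors).

```python
import numpy as np


def morlet(z):
    """Morlet wavelet, assumed here as the wavelet basis function."""
    return np.cos(1.75 * z) * np.exp(-0.5 * z ** 2)


def wnn_forward(x, w_in, a, b, w_out):
    """Forward pass: scaled and translated wavelet hidden layer, linear output.

    x: (n_in,), w_in: (n_hidden, n_in), a, b: (n_hidden,), w_out: (n_out, n_hidden)
    """
    z = (w_in @ x - b) / a          # translate by b, scale by a
    h = morlet(z)                   # hidden-layer outputs
    return w_out @ h, h


rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 5, 2     # illustrative sizes
x = rng.normal(size=n_in)
y_true = np.array([1.0, 0.0])
w_in = rng.normal(size=(n_hidden, n_in))
a = np.ones(n_hidden)               # scaling factors
b = np.zeros(n_hidden)              # translation factors
w_out = rng.normal(size=(n_out, n_hidden))
eta = 0.01                          # learning rate

y_pred, h = wnn_forward(x, w_in, a, b, w_out)
err_before = y_true - y_pred

# One gradient-descent step on the output weights for the squared error
# 0.5 * sum(err**2); its gradient w.r.t. w_out is -outer(err, h).
w_out = w_out + eta * np.outer(err_before, h)

y_new, _ = wnn_forward(x, w_in, a, b, w_out)
err_after = y_true - y_new
```

With a small learning rate, the squared error on this sample shrinks after the update, which is the behaviour the gradient correction is meant to produce at every training step.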

Stage V: POA-based parameter tuning

Finally, the hyperparameter selection of the WNN model is performed using the POA method²². This method is chosen for its balance between exploration and exploitation, which lets it search the hyperparameter space effectively for optimal values. Compared to conventional optimization methods and other metaheuristics, POA exhibits faster convergence and is less prone to getting trapped in local minima, resulting in improved overall model performance. Its simple yet efficient mechanism allows it to handle complex, multi-dimensional problems with fewer computational resources. Additionally, POA’s adaptability and robustness make it appropriate for tuning parameters in DL models like WNN, supporting improved accuracy and stability without excessive computational overhead.

Each population member represents a candidate solution, and the values of the optimization problem variables are determined by its position in the search space. In the initial phase, the population members are randomly initialized between the lower and upper bounds of the problem using Eq. (15).

$$x_{i,j} = l_j + rand \cdot (u_j - l_j), \quad i = 1, 2, \ldots, N,\ j = 1, 2, \ldots, m \tag{15}$$

where $x_{i,j}$ refers to the value of the $j$-th variable specified by the $i$-th candidate solution, $N$ stands for the number of population members, $m$ denotes the number of problem variables, $rand$ signifies a random number in the interval $[0, 1]$, and $l_j$ and $u_j$ represent the lower and upper bounds of the $j$-th problem variable. The hunting strategy is modelled in two phases: exploration and exploitation.

In the exploration stage, the pelicans locate the prey and move towards it. This behaviour is mathematically modelled in Eq. (16).

$$x_{i,j}^{P_1} = \begin{cases} x_{i,j} + rand \cdot (p_j - I \cdot x_{i,j}), & F_p < F_i \\ x_{i,j} + rand \cdot (x_{i,j} - p_j), & \text{otherwise} \end{cases} \tag{16}$$

where $x_{i,j}^{P_1}$ refers to the new status of the $i$-th pelican in the $j$-th dimension according to phase 1, $p_j$ stands for the position of the prey in the $j$-th dimension, $F_p$ is its objective function value, and $F_i$ is the objective function value of the $i$-th pelican. $I$ is a number randomly equal to 1 or 2, chosen at random for every iteration and every member.

In the exploitation stage, once the pelicans reach the water’s surface, they spread their wings and drive the fish towards a shallow region to collect them. This behaviour is mathematically modelled in Eq. (17).

$$x_{i,j}^{P_2} = x_{i,j} + R \cdot \left(1 - \frac{t}{T}\right) \cdot (2 \cdot rand - 1) \cdot x_{i,j} \tag{17}$$

where $x_{i,j}^{P_2}$ stands for the new status of the $i$-th pelican in the $j$-th dimension according to phase 2, $R$ denotes a constant equal to 0.2, $R \cdot (1 - t/T)$ is the neighbourhood radius of $x_{i,j}$, $t$ represents the iteration counter, and $T$ symbolizes the maximum number of iterations.

In both phases, a new pelican position is accepted only if it improves the objective function value and is rejected otherwise, which helps POA converge quickly towards the global optimum. The POA derives a fitness function (FF) to attain enhanced classification performance. It defines a positive number that represents the quality of a candidate solution. The minimization of the classification error rate is taken as the FF, as formulated in Eq. (18).

$$fitness(x_i) = ClassifierErrorRate(x_i) = \frac{\text{number of misclassified samples}}{\text{total number of samples}} \times 100 \tag{18}$$
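The two-phase search can be sketched as follows, following the commonly published form of the pelican optimization algorithm, which is assumed to correspond to the equations referenced above. The function name `pelican_optimize`, the population size, the iteration count, and the sphere objective are hypothetical stand-ins; in the paper's setting the objective would instead be the WNN classification error for a given hyperparameter vector.

```python
import numpy as np


def pelican_optimize(objective, lower, upper, n_pelicans=20, n_iters=100, seed=0):
    """Minimize `objective` over a box using a POA-style two-phase search."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    m = lower.size
    # Random initialization between the lower and upper bounds.
    x = lower + rng.random((n_pelicans, m)) * (upper - lower)
    f = np.array([objective(p) for p in x])
    for t in range(1, n_iters + 1):
        # The prey is a random location in the search space.
        prey = lower + rng.random(m) * (upper - lower)
        f_prey = objective(prey)
        for i in range(n_pelicans):
            # Phase 1 (exploration): move towards or away from the prey.
            big_i = rng.integers(1, 3)           # randomly 1 or 2
            if f_prey < f[i]:
                cand = x[i] + rng.random(m) * (prey - big_i * x[i])
            else:
                cand = x[i] + rng.random(m) * (x[i] - prey)
            cand = np.clip(cand, lower, upper)
            f_cand = objective(cand)
            if f_cand < f[i]:                    # accept only improvements
                x[i], f[i] = cand, f_cand
            # Phase 2 (exploitation): local search in a shrinking radius.
            radius = 0.2 * (1.0 - t / n_iters)
            cand = x[i] + radius * (2.0 * rng.random(m) - 1.0) * x[i]
            cand = np.clip(cand, lower, upper)
            f_cand = objective(cand)
            if f_cand < f[i]:
                x[i], f[i] = cand, f_cand
    best = int(f.argmin())
    return x[best].copy(), float(f[best])


def sphere(p):
    """Toy objective standing in for the classification error rate."""
    return float(np.sum(p ** 2))


best_x, best_f = pelican_optimize(sphere, lower=[-5.0, -5.0], upper=[5.0, 5.0])
```

Because a candidate position replaces the current one only when it lowers the objective, each pelican's fitness never gets worse, and the shrinking radius shifts the search from broad exploration to fine local adjustment as iterations proceed.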

Performance analysis

The performance evaluation of the EPWOD-POAADP methodology is examined using the UCSD anomaly detection dataset. The technique is simulated using Python 3.6.5 on a PC with an i5-8600k CPU, a GeForce 1050Ti 4 GB GPU, 16 GB RAM, a 250 GB SSD, and a 1 TB HDD. Parameters include a learning rate of 0.01, ReLU activation, 50 epochs, a dropout rate of 0.5, and a batch size of 5. Table 1 presents a detailed description of the dataset.

Table 1.

Details on the dataset.

Dataset	Videos	Average frames	Length
“UCSDPed1 (Bikers, small carts, walking across walkways)”	70	201	5 min
“UCSDPed2 (Bikers, small carts, walking across walkways)”	28	163	5 min

Table 2 and Fig. 3 show the overall comparative results of the EPWOD-POAADP approach with existing methods under the UCSDPed1 dataset²³. The table values indicate that the EPWOD-POAADP approach performs effectively. At a false positive rate (FPR) of 5, the EPWOD-POAADP model obtained a higher true positive rate (TPR) of 0.7129, while the MPPCA, Social Force (SF), EADN, and ADPW-FLHHO models achieved lower TPRs of 0.0915, 0.1315, 0.3466, and 0.5958, respectively. Next, at an FPR of 15, the EPWOD-POAADP technique gained a better TPR of 0.8906, whereas the MPPCA, SF, EADN, and ADPW-FLHHO models attained lower TPRs of 0.3517, 0.3676, 0.7547, and 0.8239. In addition, at an FPR of 25, the EPWOD-POAADP approach achieved a greater TPR of 0.9523, whereas the MPPCA, SF, EADN, and ADPW-FLHHO models obtained lower TPRs of 0.9379, 0.9218, 0.5373, and 0.5188. Moreover, at an FPR of 50, the EPWOD-POAADP approach reached a TPR of 1.0000, while the MPPCA, SF, EADN, and ADPW-FLHHO models achieved lower TPRs of 0.7972, 0.9089, 0.9776, and 0.9857. Finally, at an FPR of 60, the EPWOD-POAADP method achieved a maximal TPR of 1.0000, whereas the MPPCA, SF, EADN, and ADPW-FLHHO models attained lower TPRs of 0.8796, 0.9409, 0.9778, and 0.9882.

Table 2.

Comparative analysis of EPWOD-POAADP technique with other approaches under the UCSDPed1 dataset.

TPR
FPR	MPPCA	Social Force	EADN	ADPW-FLHHO	EPWOD-POAADP
0	0.0000	0.0000	0.0000	0.0000	0.0000
5	0.0915	0.1315	0.3466	0.5958	0.7129
10	0.2269	0.2455	0.5506	0.7445	0.8606
15	0.3517	0.3676	0.7547	0.8239	0.8906
20	0.4311	0.4578	0.7550	0.8744	0.9017
25	0.9379	0.9218	0.5373	0.5188	0.9523
30	0.5746	0.6329	0.9431	0.9619	0.9719
35	0.6353	0.7175	0.9565	0.9725	0.9865
40	0.6805	0.8104	0.9646	0.9775	0.9805
45	0.7549	0.8821	0.9726	0.9831	0.9904
50	0.7972	0.9089	0.9776	0.9857	1.0000
55	0.8319	0.9324	0.9776	0.9856	1.0000
60	0.8796	0.9409	0.9778	0.9882	1.0000
65	0.9115	0.9486	0.9831	0.9881	1.0000
70	0.9540	0.9566	0.9829	1.0000	1.0000
75	0.9566	0.9751	0.9885	1.0000	1.0000
80	0.9616	0.9831	1.0000	1.0000	1.0000
85	0.9671	0.9829	1.0000	1.0000	1.0000
90	0.9833	0.9831	1.0000	1.0000	1.0000
95	0.9805	0.9938	1.0000	1.0000	1.0000
100	1.0000	1.0000	1.0000	1.0000	1.0000

Fig. 3 — Comparative outcome of EPWOD-POAADP technique under UCSDPed1 dataset.
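For readers who want to summarize a TPR-versus-FPR curve like the one tabulated above as a single number, the area under the curve can be approximated with the trapezoidal rule. The sketch below is a generic illustration: the helper name `auc_trapezoid` and the sample points are made up for the example, and a coarse approximation of this kind is not expected to reproduce the AUC values reported in the paper.

```python
import numpy as np


def auc_trapezoid(fpr_percent, tpr):
    """Approximate the area under a TPR-vs-FPR curve by the trapezoidal rule.

    fpr_percent: false positive rates on a 0-100 scale.
    tpr: true positive rates on a 0-1 scale.
    """
    fpr = np.asarray(fpr_percent, dtype=float) / 100.0
    tpr = np.asarray(tpr, dtype=float)
    order = np.argsort(fpr, kind="stable")      # integrate left to right
    fpr, tpr = fpr[order], tpr[order]
    widths = np.diff(fpr)
    heights = (tpr[1:] + tpr[:-1]) / 2.0
    return float(np.sum(widths * heights))


# Illustrative points only (not taken from the tables in this paper).
example_fpr = [0, 10, 30, 60, 100]
example_tpr = [0.0, 0.6, 0.85, 0.97, 1.0]
example_auc = auc_trapezoid(example_fpr, example_tpr)

# Sanity check: a chance-level diagonal has an area of 0.5.
chance_auc = auc_trapezoid([0, 50, 100], [0.0, 0.5, 1.0])
```

A curve that rises quickly at low FPR encloses more area, so a detector that reaches a high TPR early scores closer to 1.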

Figure 4 illustrates the training accuracy (TRAAY) and validation accuracy (VLAAY) of the EPWOD-POAADP technique under the UCSDPed1 dataset. The analysis covers 0–50 epochs. The figure shows that the TRAAY and VLAAY values exhibit an increasing trend, indicating that the EPWOD-POAADP technique improves steadily across iterations. In addition, the TRAAY and VLAAY values remain close to each other across the epochs, which indicates limited overfitting and suggests reliable prediction on unseen samples.

Fig. 4 — Accuracy outcome of EPWOD-POAADP technique under UCSDPed1 dataset.

In Fig. 5, the training loss (TRALO) and validation loss (VLALO) of the EPWOD-POAADP methodology under the UCSDPed1 dataset are shown. The loss values are computed over 0–50 epochs. The TRALO and VLALO values show a decreasing trend, which indicates the method’s capability to balance the trade-off between data fitting and generalization.

Table 3 and Fig. 6 report a detailed AUC score study of the EPWOD-POAADP technique under the UCSDPed1 dataset²⁴. The results show that the TSN-RGB, Spatiotemporal, and TSN-Optical Flow techniques produced the weakest outcomes, with AUC scores of 90.57%, 91.64%, and 92.91%, respectively. Meanwhile, the MIL-C3D, Binary SVM, and EADN techniques showed considerable performance, with AUC scores of 95.05%, 96.78%, and 98.41%. Likewise, the ADPW-FLHHO technique achieved a competitive AUC score of 99.40%. Finally, the EPWOD-POAADP method achieves the highest AUC score of 99.51%.

Table 3.

AUC score outcome of EPWOD-POAADP method with existing models under UCSDPed1 dataset.

Models	AUC score (%)
EPWOD-POAADP	99.51
ADPW-FLHHO	99.40
EADN method	98.41
Binary SVM method	96.78
MIL-C3D model	95.05
TSN-optical flow method	92.91
Spatiotemporal model	91.64
TSN-RGB	90.57

Fig. 6 — AUC score analysis of EPWOD-POAADP method under UCSDPed1 dataset.

Table 4 and Fig. 7 illustrate the computational time (CT) analysis of the EPWOD-POAADP technique with existing models under the UCSDPed1 dataset. The EPWOD-POAADP technique is the most efficient, with a CT of 6.39 s, a clear improvement over the other models. For instance, the ADPW-FLHHO and Binary SVM methods register CTs of 9.63 and 10.45 s, respectively, while the MIL-C3D model and EADN method exhibit higher CTs of 11.36 and 13.42 s. Furthermore, the TSN-Optical Flow method records 12.91 s, the Spatiotemporal model 12.41 s, and the TSN-RGB 8.41 s. The EPWOD-POAADP model’s reduced CT highlights its suitability for time-sensitive applications, offering faster processing without compromising performance, with a reported score of 99.03%.

Table 4.

CT analysis of EPWOD-POAADP technique with existing models under UCSDPed1 dataset.

Models	CT (sec)
EPWOD-POAADP	6.39
ADPW-FLHHO	9.63
EADN Method	13.42
Binary SVM Method	10.45
MIL-C3D model	11.36
TSN-Optical Flow method	12.91
Spatiotemporal model	12.41
TSN-RGB	8.41

Fig. 7 — CT analysis of EPWOD-POAADP technique with existing models under UCSDPed1 dataset.

Table 5 and Fig. 8 describe the ablation study of the EPWOD-POAADP approach under the UCSDPed1 dataset. The EPWOD-POAADP approach achieved the highest AUC score of 99.51%, outperforming the ablated configurations such as WNN with 98.62%, POA with 98.10%, and CapsNet with 97.34%. Faster R-CNN and MF attained lower AUC scores of 96.80% and 96.00%, respectively. These results emphasize the superior anomaly detection capability of the EPWOD-POAADP model and indicate that each of its components contributes meaningfully to the performance gains over both classical and DL-based methods.

Table 5.

Result analysis of the ablation study of EPWOD-POAADP approach under the UCSDPed1 dataset.

UCSDPed1 Dataset
Models	AUC-Score (%)
EPWOD-POAADP	99.51
WNN	98.62
POA	98.10
CapsNet	97.34
Faster R-CNN	96.80
MF	96.00


Fig. 8 — Result analysis of the ablation study of EPWOD-POAADP approach under the UCSDPed1 dataset.

Table 6 and Fig. 9 show the comparative outcomes of the EPWOD-POAADP technique and existing methods on the UCSDPed2 dataset, in terms of the true positive rate (TPR) at fixed false positive rate (FPR) levels. The EPWOD-POAADP technique matches or exceeds the TPR of every compared method at each listed FPR level. At an FPR of 5, it attains a TPR of 0.7410, whereas the MPPCA, Social Force (SF), EADN, and ADPW-FLHHO methods attain lower TPRs of 0.0761, 0.1287, 0.3483, and 0.5720. At an FPR of 15, the EPWOD-POAADP technique achieves a TPR of 0.9497, whereas the MPPCA, SF, EADN, and ADPW-FLHHO methods reach 0.3660, 0.4204, 0.6053, and 0.7937. At an FPR of 25, the EPWOD-POAADP technique attains a TPR of 0.9253, while the MPPCA, SF, EADN, and ADPW-FLHHO approaches reach 0.5624, 0.6508, 0.7838, and 0.9095. At an FPR of 50, the EPWOD-POAADP method reaches a TPR of 1.0000, whereas the MPPCA, SF, EADN, and ADPW-FLHHO approaches reach 0.8333, 0.9455, 0.9692, and 0.9858. Lastly, at an FPR of 55, the EPWOD-POAADP method maintains a TPR of 1.0000, while the MPPCA, SF, EADN, and ADPW-FLHHO approaches achieve 0.9283, 0.9575, 0.9820, and 0.9914.

Table 6.

Comparative result of EPWOD-POAADP technique with other methods under the UCSDPed2 dataset.

TPR
FPR	MPPCA	Social Force	EADN	ADPW-FLHHO	EPWOD-POAADP
0	0.0000	0.0000	0.0000	0.0000	0.0000
5	0.0761	0.1287	0.3483	0.5720	0.7410
10	0.2530	0.2711	0.5350	0.6467	0.8107
15	0.3660	0.4204	0.6053	0.7937	0.9497
20	0.4904	0.4968	0.7487	0.8723	0.9299
25	0.5624	0.6508	0.7838	0.9095	0.9253
30	0.6937	0.7362	0.9277	0.9610	0.9888
35	0.7196	0.8319	0.9500	0.9616	0.9896
40	0.7756	0.8802	0.9557	0.9698	0.9861
45	0.7997	0.9283	0.9630	0.9825	1.0000
50	0.8333	0.9455	0.9692	0.9858	1.0000
55	0.9283	0.9575	0.9820	0.9914	1.0000
60	0.9410	0.9622	0.9867	1.0000	1.0000
65	0.9525	0.9769	0.9964	1.0000	1.0000
70	0.9643	0.9843	0.9912	1.0000	1.0000
75	0.9840	0.9906	0.9904	1.0000	1.0000
80	0.9908	1.0000	1.0000	1.0000	1.0000
85	0.9954	1.0000	1.0000	1.0000	1.0000
90	1.0000	1.0000	1.0000	1.0000	1.0000
95	1.0000	1.0000	1.0000	1.0000	1.0000
100	1.0000	1.0000	1.0000	1.0000	1.0000


Fig. 9 — Comparative outcome of EPWOD-POAADP technique under UCSDPed2 dataset.
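
The TPR/FPR pairs in Table 6 trace a sampled ROC curve for each method, and the area under such a curve can be approximated with the trapezoidal rule. The Python sketch below is only an illustration and not the authors' evaluation code. It assumes the FPR column is expressed in percent, and the function name `trapezoid_auc` is hypothetical. Because the table samples the curve only every 5 FPR units, the result is a coarse estimate and is not expected to reproduce the AUC scores reported in Table 7.

```python
# FPR grid and TPR columns transcribed from Table 6 (UCSDPed2).
fpr_percent = list(range(0, 101, 5))  # 0, 5, ..., 100 (assumed to be percent)

tpr = {
    "MPPCA": [0.0000, 0.0761, 0.2530, 0.3660, 0.4904, 0.5624, 0.6937,
              0.7196, 0.7756, 0.7997, 0.8333, 0.9283, 0.9410, 0.9525,
              0.9643, 0.9840, 0.9908, 0.9954, 1.0000, 1.0000, 1.0000],
    "Social Force": [0.0000, 0.1287, 0.2711, 0.4204, 0.4968, 0.6508, 0.7362,
                     0.8319, 0.8802, 0.9283, 0.9455, 0.9575, 0.9622, 0.9769,
                     0.9843, 0.9906, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
    "EADN": [0.0000, 0.3483, 0.5350, 0.6053, 0.7487, 0.7838, 0.9277,
             0.9500, 0.9557, 0.9630, 0.9692, 0.9820, 0.9867, 0.9964,
             0.9912, 0.9904, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
    "ADPW-FLHHO": [0.0000, 0.5720, 0.6467, 0.7937, 0.8723, 0.9095, 0.9610,
                   0.9616, 0.9698, 0.9825, 0.9858, 0.9914, 1.0000, 1.0000,
                   1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
    "EPWOD-POAADP": [0.0000, 0.7410, 0.8107, 0.9497, 0.9299, 0.9253, 0.9888,
                     0.9896, 0.9861, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000,
                     1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
}


def trapezoid_auc(fpr, tpr_values):
    """Approximate the area under a sampled ROC curve with the trapezoidal rule."""
    if len(fpr) != len(tpr_values):
        raise ValueError("fpr and tpr_values must have the same length")
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        area += width * (tpr_values[i] + tpr_values[i - 1]) / 2.0
    return area


fpr = [p / 100.0 for p in fpr_percent]  # rescale the assumed percent to [0, 1]
auc = {name: trapezoid_auc(fpr, values) for name, values in tpr.items()}

for name, value in sorted(auc.items(), key=lambda kv: kv[1]):
    print(f"{name:14s} approximate AUC = {value:.4f}")
```

The ranking of the approximate areas follows the ordering of the methods in Table 6, with EPWOD-POAADP obtaining the largest area.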

Figure 10 illustrates the training accuracy (TRAAY) and validation accuracy (VLAAY) of the EPWOD-POAADP technique on the UCSDPed2 dataset. The accuracy values are computed over 0–50 epochs. Both curves exhibit an increasing trend, indicating that the performance of the EPWOD-POAADP methodology improves over successive iterations. The TRAAY and VLAAY curves also remain close across the epochs, which indicates minimal overfitting and suggests reliable prediction on unseen samples.

Fig. 10 — Accuracy curve of EPWOD-POAADP technique under UCSDPed2 dataset.

Figure 11 illustrates the training loss (TRALO) and validation loss (VLALO) curves of the EPWOD-POAADP approach on the UCSDPed2 dataset. The loss values are computed over 0–50 epochs. Both TRALO and VLALO show a decreasing trend, which indicates the capacity of the EPWOD-POAADP method to balance the trade-off between fitting the training data and generalization.

Fig. 11 — Loss analysis of EPWOD-POAADP technique under UCSDPed2 dataset.
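
The reasoning that close training and validation curves indicate little overfitting can be made concrete by measuring the gap between the two curves. The Python sketch below is a minimal illustration under stated assumptions: the per-epoch values are made-up placeholders, not the curves of Figs. 10 and 11, and the helper name `max_gap` is hypothetical.

```python
def max_gap(train, val, last_k=None):
    """Largest absolute train/validation difference, optionally over the last k epochs."""
    if len(train) != len(val) or not train:
        raise ValueError("curves must be non-empty and of equal length")
    pairs = list(zip(train, val))
    if last_k is not None:
        pairs = pairs[-last_k:]
    return max(abs(t - v) for t, v in pairs)


# Hypothetical accuracy values for five checkpoints (illustration only,
# NOT taken from the paper's figures).
train_acc = [0.90, 0.94, 0.96, 0.975, 0.985]
val_acc = [0.89, 0.935, 0.955, 0.97, 0.98]

gap_all = max_gap(train_acc, val_acc)
gap_late = max_gap(train_acc, val_acc, last_k=2)

# A small gap that stays small in the late epochs is consistent with
# limited overfitting; a widening gap would suggest the opposite.
print(f"max gap over all epochs: {gap_all:.3f}")
print(f"max gap over last 2 epochs: {gap_late:.3f}")
```

The same check applies to the loss curves by passing the training and validation loss sequences instead of the accuracy sequences.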

Table 7 and Fig. 12 report the AUC score comparison of the EPWOD-POAADP methodology with existing models on the UCSDPed2 dataset. The TSN-RGB, Spatiotemporal, and TSN-Optical Flow techniques yield the lowest AUC scores of 90.45%, 92.49%, and 94.37%, respectively. The MIL-C3D, Binary SVM, and EADN techniques perform better, with AUC scores of 95.51%, 97.17%, and 98.31%. The ADPW-FLHHO approach comes closest with an AUC score of 99.20%, and the EPWOD-POAADP approach attains the highest AUC score of 99.35%.

Table 7.

AUC score outcome of EPWOD-POAADP method with existing models under UCSDPed2 dataset.

Methods	AUC Score (%)
EPWOD-POAADP	99.35
ADPW-FLHHO model	99.20
EADN method	98.31
Binary SVM Method	97.17
MIL-C3D technique	95.51
TSN-Optical Flow system	94.37
Spatiotemporal method	92.49
TSN-RGB algorithm	90.45


Fig. 12 — AUC score outcome of EPWOD-POAADP method under UCSDPed2 dataset.

Table 8 and Fig. 13 present the CT analysis of the EPWOD-POAADP methodology against existing models on the UCSDPed2 dataset. The EPWOD-POAADP methodology achieves the lowest CT of 8.12 s. In contrast, the ADPW-FLHHO, EADN, and Binary SVM models report CTs of 11.34, 11.23, and 11.76 s, respectively. The MIL-C3D and TSN-RGB approaches exhibit CTs of 12.47 and 11.87 s, while the TSN-Optical Flow system and Spatiotemporal method are slower still, with CTs of 13.72 and 19.52 s. The reduced CT of the EPWOD-POAADP method supports its suitability for latency-critical applications, pairing fast decision-making with a reported score of 99.03%. This responsiveness is relevant to real-time pedestrian safety systems, especially in dynamic urban environments.

Table 8.

CT analysis of EPWOD-POAADP technique with existing models under UCSDPed2 dataset.

Methods	CT (sec)
EPWOD-POAADP	8.12
ADPW-FLHHO model	11.34
EADN method	11.23
Binary SVM method	11.76
MIL-C3D technique	12.47
TSN-optical flow system	13.72
Spatiotemporal method	19.52
TSN-RGB algorithm	11.87


Fig. 13 — CT analysis of EPWOD-POAADP technique with existing models under UCSDPed2 dataset.

Table 9 and Fig. 14 depict the ablation study of the EPWOD-POAADP methodology on the UCSDPed2 dataset. The complete EPWOD-POAADP methodology attained an AUC score of 99.35%, ahead of the WNN, POA, and CapsNet configurations with 98.46%, 97.71%, and 96.95%, respectively. Meanwhile, the Faster R-CNN and MF configurations achieved lower AUC scores of 96.45% and 95.81%, respectively. These results indicate that the complete EPWOD-POAADP model provides the strongest anomaly detection performance, supporting the contribution of its architecture and optimization strategy in handling complex video surveillance data.

Table 9.

Comparative performance evaluation of the EPWOD-POAADP methodology through ablation under the UCSDPed2 dataset.

UCSDPed2 Dataset
Methods	AUC Score (%)
EPWOD-POAADP	99.35
WNN	98.46
POA	97.71
CapsNet	96.95
Faster R-CNN	96.45
MF	95.81


Fig. 14 — Comparative performance evaluation of the EPWOD-POAADP methodology through ablation under the UCSDPed2 dataset.

Table 10 compares the computational efficiency of the EPWOD-POAADP method with several upsampling methods in terms of FLOPs and GPU memory consumption²⁵. The EPWOD-POAADP method attained the lowest FLOPs at 90.34 and the lowest GPU usage at 1200. In contrast, Pixel Shuffle recorded the highest FLOPs at 167.31, while Dysample consumed the most GPU memory at 3530. Deconv and Bilinear also show higher resource demands, with FLOPs of 143.93 and 135.86 and GPU usage of 2748 and 3049, respectively. These results suggest that the EPWOD-POAADP model is computationally efficient and appropriate for resource-constrained environments.

Table 10.

Comparison of upsampling methods based on FLOPs and GPU usage.

Methods	FLOPs	GPU
Nearest	136.01	2161
Bilinear	135.86	3049
Deconv	143.93	2748
Pixel Shuffle	167.31	2895
Dysample	136.07	3530
CARAFE	135.98	2314
EPWOD-POAADP	90.34	1200


Conclusion

In this paper, a novel EPWOD-POAADP method is proposed. The main intention of the EPWOD-POAADP method is to enhance pedestrian walkways for blind people’s navigation. At first, the image pre-processing stage applies MF to eliminate the noise in the input data. Next, the Faster R-CNN model is employed for the object detection process to identify and locate objects within an image. The CapsNet model is then used for feature extraction. Furthermore, the WNN technique is implemented for the detection and classification process. Finally, the POA performs the hyperparameter selection of the WNN model. The experimental evaluation of the EPWOD-POAADP approach is carried out on the UCSD anomaly detection benchmark dataset. The results indicated the enhanced performance of the EPWOD-POAADP approach compared to recent approaches.

The limitations of the EPWOD-POAADP approach include a reliance on a limited dataset, which may affect the generalizability of the results across diverse real-world scenarios. Furthermore, the approach does not address real-time processing constraints, which are significant for practical deployment in dynamic environments. The robustness of the model against varying environmental conditions and occlusions remains unexplored, and its scalability to larger and more complex pedestrian networks is not thoroughly evaluated. Future work could integrate adaptive learning methods to improve model flexibility, incorporate multi-modal sensor data for improved results, and develop lightweight algorithms suitable for edge computing devices to enable faster, on-site processing.

Acknowledgements

The authors thank the King Salman Center for Disability Research for funding this work through Research Group no. KSRG-2024-143.

Author contributions

All authors wrote the main manuscript text, prepared the figures, analyzed the results, and reviewed the manuscript.

Data availability

The data supporting this study’s findings are openly available at [http://www.svcl.ucsd.edu/projects/anomaly/dataset.html](http://www.svcl.ucsd.edu/projects/anomaly/dataset.html), reference number [23].

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Campisi, T., Ignaccolo, M., Inturri, G., Tesoriere, G. & Torrisi, V. Evaluation of walkability and mobility requirements of visually impaired people in urban spaces. Res. Transport. Bus. Manag. 40, 100592 (2021).
2. Bentzen, B. L. et al. Wayfinding problems for blind pedestrians at noncorner crosswalks: novel solution. Transp. Res. Rec. 2661(1), 120–125 (2017).
3. Chanana, P., Paul, R., Balakrishnan, M. & Rao, P. V. M. Assistive technology solutions for aiding travel of pedestrians with visual impairment. J. Rehabilitation Assist. Technol. Eng. 4, 2055668317725993 (2017).
4. Frazila, R. B., Zukhruf, F., Simorangkir, C. O. & Burhani, J. T. Constructing pedestrian level of service based on the perspective of visual impairment person. In MATEC Web of Conferences. Vol. 270. 03009. (EDP Sciences, 2019).
5. Mattsson, P. et al. Improved usability of pedestrian environments after dark for people with vision impairment: An intervention study. Sustainability 12(3), 1096 (2020).
6. Cohen, A. & Dalyot, S. Route planning for blind pedestrians using openstreetmap. Environ. Plann. B: Urban Analytics City Sci. 48(6), 1511–1526 (2021).
7. Hsieh, I. H., Cheng, H. C., Ke, H. H., Chen, H. C. & Wang, W. J. A CNN-based wearable assistive system for visually impaired people walking outdoors. Appl. Sci. 11(21), 10026 (2021).
8. El-Taher, F. E. Z., Taha, A., Courtney, J. & Mckeever, S. A systematic review of urban navigation systems for visually impaired people. Sensors 21(9), 3103 (2021).
9. Mediastika, C. E., Sudarsono, A. S. & Kristanto, L. The sound perceptions of urban pavements by sighted and visually impaired people–A case study in Surabaya, Indonesia. J. Urbanism: Int. Res. Placemaking Urban Sustain. 15(1), 106–129 (2022).
10. Sreeraman, Y. et al. Enhancing anomaly detection in pedestrian walkways using improved sparrow search algorithm with parallel features fusion model. Fusion Pract. Appl. 14(2) (2024).
11. Bhatlawande, S., Dhande, S., Gupta, D., Madake, J. & Shilaskar, S. Pedestrian and vehicle detection for visually impaired people. In International Conference on Communications and Cyber Physical Engineering 2018. 37–51. (Springer Nature Singapore, 2023).
12. Kumar, A., Chakravarty, A., Choudhary, A. & Indu, S. Camera-based mobility framework for visually impaired pedestrians in unstructured environments. In 2024 IEEE Intelligent Vehicles Symposium (IV). 311–316 (IEEE, 2024).
13. Adi, H. P., Heikoop, R. & Wahyudi, S. I. Enhancing inclusivity: designing disability friendly pedestrian pathways. Int. J. Saf. Secur. Eng. 14(3) (2024).
14. Sharma, S. & George, B. A shoe with bipolar electrodes for ground impedance based pedestrian pathway classification. IEEE Sens. J. (2024).
15. Hamadi, A. & Latoui, A. An accurate smartphone-based indoor pedestrian localization system using ORB-SLAM camera and PDR inertial sensors fusion approach. Measurement 240, 115642 (2025).
16. Yoshikawa, T. & Premachandra, C. Pedestrian crossing sensing based on Hough space analysis to support visually impaired pedestrians. Sensors 23(13), 5928 (2023).
17. Guo, X. & Shen, Z. Smart pedestrian crossing design by using smart devices to improve pedestrian safety. Rev. Adhes. Adhes. 11(3) (2023).
18. Ahmed, S. & Islam, S. Methods in detection of median filtering in digital images: A survey. Multimedia Tools Appl. 82(28), 43945–43965 (2023).
19. Zia, H. et al. Advancing road safety: A comprehensive evaluation of object detection models for commercial driver monitoring systems. Future Transport. 5(1), 2 (2025).
20. Katkam, S., Tulasi, V. P., Dhanalaxmi, B. & Harikiran, J. Multi-class Diagnosis of Neurodegenerative Diseases using Effective Deep Learning Models with Modified DenseNet-169 and Enhanced DeepLabV3+. IEEE Access (2025).
21. Hu, X. et al. Research on RTD fluxgate induction signal denoising method based on particle swarm optimization wavelet neural network. Sensors 25(2), 482 (2025).
22. Ajenikoko, G. A., Adebayo, I. G. & Adeleke, B. S. Hybridization of Mayfly-Pelican Optimization Algorithm for Selection of CNN Optimal Hyper-Parameters.
23. http://www.svcl.ucsd.edu/projects/anomaly/dataset.html
24. Alohali, M. A. et al. Anomaly detection in pedestrian walkways for intelligent transportation system using federated learning and Harris hawks optimizer on remote sensing images. Remote Sens. 15(12), 3092 (2023).
25. Li, Z. et al. Self-supervised feature contrastive learning for small weak object detection in remote sensing. Remote Sens. 17(8), 1438 (2025).
