Abstract
The Internet of Things (IoT) has emerged as a pervasive technological paradigm that interconnects heterogeneous devices and sensors, enabling continuous data acquisition, communication, and intelligent decision-making. However, the large-scale, dynamic, and heterogeneous nature of IoT environments introduces significant cybersecurity threats, making intrusion detection a critical component of IoT network protection. The complexity and high dimensionality of IoT traffic data pose substantial challenges for machine-learning-based intrusion detection systems, particularly for classification accuracy. In this context, feature selection (FS), which aims to identify the most informative and non-redundant features, plays a vital role in enhancing detection performance. This study proposes a model that addresses the FS problem on high-dimensional IoT datasets using bird-based metaheuristic optimization algorithms integrated with a taper-shaped transfer function for binary transformation. The kNN, SVM, and RF classifiers are employed to evaluate the model using 10-fold cross-validation. Experimental results on the RT-IoT2022 and IoTID20 datasets show that bird-based FS methods achieve substantial dimensionality reduction and strong classification performance. On RT-IoT2022, the Secretary Bird Optimization Algorithm (SBOA), the best-performing model, selected only 6 of 81 features, achieving the highest feature reduction ratio of 92.59% and a classification accuracy of 99.69%. On IoTID20, the algorithm selected only 7 of 81 features, achieving a feature reduction ratio of 91.36% and a classification accuracy of 98.46%. SBOA also performs well in terms of sensitivity, specificity, precision, and computational time, underscoring its robustness in handling complex IoT traffic data. The findings indicate that bird-inspired optimization approaches, when integrated with an effective binary transfer mechanism, offer a powerful solution for real-time IoT intrusion detection systems.
Keywords: Internet of things, Feature selection, Bird-based optimization algorithms, Metaheuristic algorithms
Subject terms: Engineering, Mathematics and computing
Introduction
IoT systems comprise many interconnected devices that continuously generate data, which often must be analyzed in real time. The rapid proliferation of IoT technologies has generated large-scale, heterogeneous, and high-dimensional data1. These systems are employed in various fields, including smart cities2, healthcare3, environmental monitoring4, manufacturing5, and energy management6. While this volume of information presents opportunities for knowledge discovery and intelligent decision-making, it also poses considerable challenges for data analysis and machine learning (ML) tasks. In particular, the presence of redundant or irrelevant features may hinder the accuracy and efficiency of classification models7–9. To address this, feature selection (FS) has become an essential preprocessing step to improve learning performance10, enhance interpretability, and reduce computational overhead11 in IoT-based applications12,13.
Traditional FS methods often struggle to balance computational time and classification accuracy, particularly when applied to IoT-specific datasets. Recently, bio-inspired optimization algorithms have shown great potential for solving complex FS problems10. FS is a vital step in applications that process high-dimensional data: it enables a model to achieve higher accuracy with fewer features, thereby reducing training time and improving generalization. In IoT environments in particular, selecting appropriate features is essential for real-time systems14,15.
FS methods are generally classified into three categories: filters, wrappers, and embedded approaches. The filter method employs statistical criteria to select features, evaluates them independently of the classifier, and identifies the most relevant ones12. Wrapper methods assess the performance of different feature subsets by using a classifier and select the subset that produces the highest accuracy. Finally, embedded methods integrate feature selection within the training process of the classification algorithm. These approaches often face challenges, such as long solution times and difficulty finding optimal solutions, primarily due to the high dimensionality of the data7. Nevertheless, further research is needed to evaluate and compare their effectiveness in IoT contexts.
In recent years, to address these issues, nature-inspired metaheuristic optimization algorithms have been applied to FS problems10,16,17. Approaches such as Particle Swarm Optimization (PSO)7,18, Firefly Algorithm (FA)19, Cuckoo Search20,21, and Grey Wolf Optimization (GWO)22–24 have shown promising results in high-dimensional data, especially in IoT-based classification and intrusion detection (ID) systems.
Among the diverse set of optimization methods, bird-flocking-inspired strategies offer efficient mechanisms for global exploration and local exploitation, making them suitable for complex search spaces such as IoT feature subsets. The Secretary Bird Optimization Algorithm (SBOA), a relatively new method in the literature, is inspired by the hunting techniques of the secretary bird and offers a functional search approach for high-dimensional, complex problems by balancing exploration and exploitation. Recent studies have also reported statistically significant performance improvements with SBOA in areas such as wireless sensor networks (WSNs)25–28, engineering optimization29, and clustering25. However, research on SBOA’s performance in IoT-based FS and classification problems is limited.
To address the challenges posed by high-dimensional IoT data, this study presents an ID approach for IoT networks using a bio-inspired metaheuristic. Specifically, it applies an SBOA-based FS strategy within a wrapper framework, employing a k-Nearest Neighbor (k-NN) classifier to compute the fitness function. The method aims to improve classification accuracy while reducing feature dimensionality and classification error. The unique aspect of this research is the extension of SBOA to IoT-driven FS problems and the benchmarking of its performance against other bird-based bio-inspired algorithms, namely Harris Hawk Optimization (HHO)30 and the Cuckoo Search Algorithm (CSA)20,31. We also modified the SBOA algorithm by using the taper-shaped transfer function, a relatively new method for transforming a continuous search space into a binary one. Additionally, we compared the results of the bird-based algorithms with those of the binary PSO algorithm.
Experimental results demonstrate that SBOA is a promising alternative for real-time IoT analytics, where both accuracy and efficiency are crucial. Test results on the datasets indicate that SBOA achieves competitive performance in terms of classification accuracy and feature selection.
The main contributions of this study can be outlined as follows:
A novel binary framework for bird-based metaheuristic algorithms is developed to address the FS problem in IoT traffic data.
A taper-shaped transfer function is employed to enhance the balance between exploration and exploitation during the binary transformation process.
The performance of SBOA is systematically compared with HHO and CSA, two widely studied bird-based metaheuristic algorithms, and PSO.
Extensive experiments demonstrate that SBOA achieves competitive performance in terms of accuracy, number of selected features, and computational time.
The remainder of this paper is structured as follows. The Related works section reviews the relevant literature and highlights existing approaches to feature selection in IoT systems. The Materials and methods section describes the materials and methodological framework adopted in this study, while the Experimental results section presents the experimental setup and numerical results. The Discussion section examines the findings in detail, emphasizing the strengths and limitations of the proposed model and comparing them with related works. Finally, the Conclusion summarizes the study’s main contributions and outlines potential directions for future research.
Related works
The rapid growth of IoT systems has driven extensive research into efficient data processing techniques, especially in the context of FS for high-dimensional datasets. Many approaches have been proposed to address the challenges of dimensionality reduction, ranging from traditional statistical methods12,15 to advanced ML, deep learning (DL), and metaheuristic optimization techniques32,33. While classical FS methods often provide quick, interpretable results, they can struggle with scalability and accuracy in complex IoT systems. In response, bio-inspired and nature-inspired optimization algorithms have garnered increasing attention, providing robust mechanisms for balancing exploration and exploitation in large search spaces. This section reviews both conventional FS methods and recent advances in metaheuristic-based solutions, with particular attention to their applications in IoT environments.
Almohaimeed and Albalwy developed an FS method designed to enhance the effectiveness of an anomaly-based ID system in Internet of Things (IoT) environments. The study used the RT-IoT2022 dataset, which was designed specifically for real-time attack detection. This method combines information gain, gain ratio, correlation-based feature selection, Pearson correlation, and symmetric uncertainty techniques to identify the most essential features. Additionally, the relationships between features and attacks are analyzed in detail12.
In another study15, the authors assessed FS methods combined with principal component analysis (PCA) and DL classifiers to enhance the performance of anomaly-based ID approaches in AIoT and IoT. By applying PCA, Pearson correlation analysis, and artificial neural networks, their method achieved an accuracy of 99.7% on the RT-IoT2022 dataset. The study demonstrated the link between attack types and a small set of features chosen from high-dimensional datasets.
Because traditional FS methods are insufficient for high-dimensional IoT data, nature-inspired metaheuristic optimization algorithms have been employed in FS problems. Kılıç et al. proposed a new multi-population-based PSO method for FS. The proposed approach simultaneously explores the solution space using both a randomly initialized population and a population based on feature-importance values. The authors utilized the ReliefF method and attained a more efficient search and faster convergence. Experiments on 26 UCI and 3 ASU datasets demonstrate that the algorithm achieves the highest classification accuracy when fewer features are selected, particularly with high-dimensional datasets7.
Hamdan et al. evaluated various ML algorithms, including Naive Bayes and Support Vector Machine, alongside feature reduction methods (PCA, PSO, and GWO), for anomaly detection in IoT environments. Experiments were conducted on the RT-IoT2022 dataset. The results showed that most algorithms were practical for IoT anomaly detection. SVM achieved the highest performance, attaining 99.99% accuracy when combined with PCA34.
Bird behavior-based optimization methods have gained popularity in recent years. Liu and Ni35 introduced a lightweight and effective ID method for resource-limited WSNs. The method employed RobustScaler to remove outliers, Incremental Principal Component Analysis for dimensionality reduction, and HHO to select the most relevant features. Attacks are then classified using a decision tree, achieving an accuracy of 99.65% on the CICIDS2017 dataset.
In another study36, the authors introduce a method based on SBOA for the Set Covering Problem (SCP) and its variant, the Unicost SCP (USCP), both of which are known to be NP-complete. Experimental results demonstrate that binary SBOA (BSBOA) achieves higher solution quality and lower computational cost than GWO and PSO for both SCP and USCP. Although various binary conversion strategies have been used in metaheuristic-based feature selection, most studies rely on static transfer functions, such as Sigmoid (S-shape), V-shape, or U-shape, to map continuous search spaces into discrete domains. However, these conventional functions often maintain fixed transition probabilities throughout the optimization process, undermining the balance between exploration and exploitation. In contrast, the proposed BSBOA–taper function pairing introduces a dynamic mechanism that adaptively adjusts the transition probability. By incorporating the taper function, the algorithm can more effectively control convergence, favoring exploration in the early stages and intensifying exploitation in the later stages, thereby overcoming stagnation issues commonly encountered with static S-shaped or V-shaped functions on high-dimensional IoT datasets.
Table 1 summarizes feature selection and optimization-based approaches in IoT environments and highlights the position of the proposed method within the existing literature. In summary, the existing literature highlights a clear shift from traditional statistical FS techniques towards more sophisticated metaheuristic and hybrid optimization methods designed for IoT and AIoT environments. While traditional FS methods remain relevant due to their benefits, such as interpretability and simplicity, they face significant limitations in scalability and efficiency when dealing with high-dimensional, real-time data streams. Emerging research indicates that combining metaheuristic optimization algorithms with ML and DL classifiers not only boosts detection accuracy but also reduces computational time, making these methods more suitable for resource-limited IoT systems. Overall, the reviewed research supports a shift toward nature-inspired, lightweight, and adaptive FS strategies that effectively address the growing challenges of IoT security and data analytics. However, despite these advances, a notable gap persists in systematically adapting and benchmarking new optimization algorithms for the specific context of IoT data. In particular, comprehensive evaluations of classification accuracy, computational overhead, scalability, and real-time performance are still limited, underscoring the need for further research.
Table 1.
Summary of related feature selection and optimization-based studies in IoT environments.
| References | Dataset(s) | FS / Optimization method | Classifier / Task |
|---|---|---|---|
| Almohaimeed & Albalwy12 | RT-IoT2022 | IG, GR, CFS, Pearson, SU | Anomaly-based IDS |
| Albalwy & Almohaimeed15 | RT-IoT2022 | PCA, Pearson analysis | ANN (DL-based IDS) |
| Kılıç et al.7 | 26 UCI, 3 ASU | Multi-population PSO + ReliefF | kNN |
| Hamdan et al.34 | RT-IoT2022 | PCA, PSO, GWO | NB, SVM |
| Liu et al.35 | CICIDS2017 | HHO-based FS + IPCA | Decision Tree |
| Crawford et al.36 | SCP, USCP | Binary SBOA | Optimization problem |
| This study | RT-IoT2022, IoTID20 | Bird-inspired metaheuristics + taper-shaped transfer functions (binary FS) | kNN, SVM, RF |
Materials and methods
This section outlines the components of the study’s experimental process, including the dataset, preprocessing steps, FS, and the algorithms employed. First, it provides information about the dataset used, followed by a description of the preprocessing performed before classification. Next, it addresses the FS problem within a wrapper approach framework, aiming to identify suitable feature subsets. For this purpose, a nature-inspired metaheuristic algorithm is discussed. The proposed FS approach is shown in Fig. 1.
Fig. 1.
Diagram of the proposed model.
As illustrated in Fig. 1, the proposed methodology begins with data acquisition and preprocessing, then applies bird-inspired metaheuristic algorithms with taper-shaped transfer functions for binary feature selection. The selected feature subsets are evaluated using a k-NN-based fitness function with 10-fold cross-validation, and the optimization process iteratively improves the solutions until convergence. The ultimately selected features are used for classification and performance assessment.
Fitness function
A fitness function, also known as an objective or cost function, assesses the quality of candidate solutions. The problem must first be formulated as an objective function, after which the proposed algorithm is employed to optimize it. This study aims to maximize the accuracy of attack detection while minimizing the number of features in the selected subset. The fitness of a solution is calculated using Eq. (1)7.
$$F = \alpha \cdot Err + \beta \cdot \frac{SF}{NF} \tag{1}$$
where F indicates the fitness function, SF is the number of selected features, NF is the total number of features, Err is the classification error rate, and $\alpha$ and $\beta$ are the corresponding weights, with $\alpha$ set to 0.99 and $\beta$ set to 0.01, as in refs. 7 and 37. The weighting parameters are selected to prioritize classification accuracy, a primary requirement in IoT intrusion detection.
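As a rough illustration of how such a weighted fitness could be computed, the following Python sketch assumes the common wrapper form F = 0.99 · Err + 0.01 · (SF / NF), where lower values are better. The function name, the argument names, and the penalty for an empty feature subset are illustrative assumptions and are not taken from the study, which reports a MATLAB implementation.

```python
def fitness(error_rate, n_selected, n_total, alpha=0.99, beta=0.01):
    """Weighted wrapper fitness; lower is better.

    Assumed form: alpha * Err + beta * (SF / NF).
    """
    if n_selected == 0:
        # Assumption: an empty subset cannot be classified, so it is penalized.
        return float("inf")
    return alpha * error_rate + beta * (n_selected / n_total)


# Example with figures quoted in the abstract: 6 of 81 features and
# 99.69% accuracy, i.e. an error rate of 0.0031.
score = fitness(error_rate=0.0031, n_selected=6, n_total=81)
```

With these weights the error term dominates, so two subsets with the same error rate are ranked by size, while a small gain in accuracy outweighs a sizeable change in the number of features.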
Data preprocessing
Data preprocessing is a critical phase that transforms raw data into a clean, interpretable format for ML algorithms. To ensure model validity and reduce computational time, a systematic preprocessing pipeline was implemented. First, an exploratory data analysis revealed that the 'Prototype' and 'Service' attributes contained high-cardinality nominal data with limited predictive power; thus, they were excluded to prevent dimensionality issues and potential overfitting. The 'Attack type' attribute, serving as the target variable, was binary-encoded, with normal instances mapped to 0 and attack instances to 1. For IoTID20, the 'Timestamp' and 'FlowID' attributes were removed to prevent overfitting to specific network identifiers or time periods. This allows the model to focus on the structural and statistical characteristics of network traffic rather than on specific user identifiers, thereby improving the system’s generalizability across different network environments. Furthermore, no standardization of the data was required. Finally, a comprehensive missing-value analysis was conducted, confirming the dataset’s integrity and revealing no null entries requiring imputation.
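The preprocessing steps described for RT-IoT2022 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: rows are represented as plain dictionaries, the column names follow the attribute names used in the text, the three normal class names are taken from Table 4, and `flow_duration` is a hypothetical feature used only to make the example runnable.

```python
NORMAL_CLASSES = {"MQTT Publish", "Thing Speak", "Wipro bulb"}  # per Table 4
DROPPED_COLUMNS = {"Prototype", "Service"}  # high-cardinality nominal columns
TARGET = "Attack type"


def preprocess(rows):
    """Drop the nominal columns and encode the target as 0 (normal) or 1 (attack)."""
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            # The text reports no missing values, so a null is treated as an error.
            raise ValueError("unexpected missing value")
        features = {name: value for name, value in row.items()
                    if name not in DROPPED_COLUMNS and name != TARGET}
        label = 0 if row[TARGET] in NORMAL_CLASSES else 1
        cleaned.append((features, label))
    return cleaned


# Two made-up rows; "flow_duration" is a hypothetical feature name.
sample = [
    {"Prototype": "tcp", "Service": "mqtt", "flow_duration": 0.5,
     "Attack type": "MQTT Publish"},
    {"Prototype": "tcp", "Service": "-", "flow_duration": 0.01,
     "Attack type": "DOS SYN Hping"},
]
result = preprocess(sample)
```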
Feature selection
Feature selection (FS) is a preprocessing step that identifies the most relevant features for ML models. FS methods are generally classified into three main groups: filter, wrapper, and embedded methods. Filter methods are based on statistical criteria and evaluate features independently. Wrapper methods, on the other hand, are typically used in conjunction with heuristic search algorithms and assess subsets of features using a classifier, such as k-NN. Finally, in embedded methods, FS is conducted in conjunction with model training.
Wrapper methods are preferred, particularly in supervised learning applications, because they typically provide higher classification accuracy. Because wrapper methods are commonly paired with heuristic search algorithms, SBOA was selected in this study to drive the wrapper-based FS process, with the aim of reducing computational cost and mitigating the risk of becoming trapped in a local optimum.
Secretary bird optimization algorithm
The secretary bird optimization algorithm is a new metaheuristic inspired by the hunting and predator-escape strategies of the secretary bird. It was proposed by Fu et al.38 in 2024. Secretary birds are highly effective at hunting snakes. They employ techniques such as wide-area scanning, target fixation, and sudden acceleration to capture prey. These behaviors are adapted into exploration and exploitation phases, offering practical solutions for complex optimization problems. The algorithm has two main phases: an exploration phase, modeled after hunting strategies, and an exploitation phase, modeled after escape strategies.
The positions of the secretary birds represent candidate solutions. Equation (2) is used to initialize their positions within the search area randomly.
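$$x_{i,j} = lb_j + r \times (ub_j - lb_j), \quad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, d \tag{2}$$
where $x_{i,j}$ represents the position of the $i$th secretary bird in the $j$th dimension, $ub_j$ and $lb_j$ are the upper and lower bounds, respectively, and r is a random number between 0 and 1. The population of secretary birds can be represented as a matrix, as shown in Eq. (3).
$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,j} & \cdots & x_{1,d} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ x_{i,1} & \cdots & x_{i,j} & \cdots & x_{i,d} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ x_{n,1} & \cdots & x_{n,j} & \cdots & x_{n,d} \end{bmatrix}_{n \times d} \tag{3}$$
where X is the population of secretary birds, $x_{i,j}$ is the value of the $j$th decision variable of the $i$th secretary bird, n is the number of secretary birds, and d is the number of decision variables. The fitness values of the birds are stored using Eq. (4).
$$F = \begin{bmatrix} F(X_1) \\ \vdots \\ F(X_i) \\ \vdots \\ F(X_n) \end{bmatrix}_{n \times 1} \tag{4}$$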
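Once a new candidate position is generated during the exploration or exploitation phase, the algorithm compares the fitness of the new solution ($F_i^{new}$) with that of the existing solution ($F_i$), as in Eq. (5). Because the fitness in Eq. (1) combines the classification error rate with the ratio of selected features, it is minimized: the new solution is accepted only if it yields a better (lower) fitness value; otherwise, the current solution is retained.
$$X_i = \begin{cases} X_i^{new}, & \text{if } F_i^{new} < F_i \\ X_i, & \text{otherwise} \end{cases} \tag{5}$$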
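The initialization and acceptance steps can be sketched as follows. This is a minimal Python illustration, not the study's MATLAB code: it assumes scalar bounds shared by all dimensions, a minimized objective, and a hypothetical toy objective (`sphere`) that stands in for the wrapper fitness.

```python
import random


def initialize_population(n_birds, dim, lb, ub, rng):
    """Random start: each coordinate is lb + r * (ub - lb) with r in [0, 1)."""
    return [[lb + rng.random() * (ub - lb) for _ in range(dim)]
            for _ in range(n_birds)]


def greedy_update(current, current_fit, candidate, objective):
    """Keep the candidate only if it lowers the (minimized) objective."""
    candidate_fit = objective(candidate)
    if candidate_fit < current_fit:
        return candidate, candidate_fit
    return current, current_fit


def sphere(x):
    """Hypothetical toy objective used only for this example."""
    return sum(v * v for v in x)


rng = random.Random(0)
population = initialize_population(n_birds=20, dim=5, lb=-1.0, ub=1.0, rng=rng)
fitness_values = [sphere(x) for x in population]
```

The greedy rule means a bird's fitness never gets worse from one iteration to the next, which is what makes the best-so-far solution monotone over the run.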
Exploration phase
This phase simulates the secretary bird’s search for and hunting of prey and corresponds to a global search of the solution space. The hunting process is divided into three equal stages, which model the bird’s searching, prey-exhausting, and attacking behaviors.
Searching for Prey: Secretary birds have sharp vision, enabling them to detect and hunt snakes with ease, even in tall grass. The algorithm randomly selects two solutions from the population and generates a new solution by combining their differences. This promotes diversity and explores new areas in the search space. This movement is modeled using Eq. (6).
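$$x_{i,j}^{new} = x_{i,j} + (x_{random_1} - x_{random_2}) \times R_1, \quad t < \tfrac{1}{3}T \tag{6}$$
where t represents the current iteration number, T represents the maximum iteration number, $x_{i,j}^{new}$ represents the value of the new position in the $j$th dimension, $x_{random_1}$ and $x_{random_2}$ represent random candidate solutions, and $R_1$ represents a randomly generated sequence in the range [0,1] in the dimension of the solution space.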
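A minimal Python sketch of this difference-based move is given below. It assumes the update adds the difference of two randomly chosen birds, scaled per dimension by a uniform random number, to the current position; drawing the two birds so that they differ from each other and from the current bird is an additional assumption made here for clarity.

```python
import random


def search_for_prey(population, i, rng):
    """New position: x_i + (x_r1 - x_r2) * R1, with R1 uniform in [0, 1) per dimension."""
    others = [k for k in range(len(population)) if k != i]
    r1, r2 = rng.sample(others, 2)  # two distinct random birds (assumption)
    x = population[i]
    return [x[j] + (population[r1][j] - population[r2][j]) * rng.random()
            for j in range(len(x))]


rng = random.Random(0)
flock = [[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]]
candidate = search_for_prey(flock, 0, rng)
```

Because the step is proportional to the spread between two birds, the move shrinks automatically as the population converges.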
Consuming Prey: After a secretary bird discovers its prey during a hunt, it employs a unique strategy. Unlike other birds of prey, the secretary bird doesn’t attack immediately; instead, it maneuvers around its prey with quick footwork. The secretary bird observes its prey from above and tires it out with jumping and provocative movements. The bird’s armored claws and long armored legs make it resistant to prey and provide a significant physical advantage. Brownian motion was used to simulate the secretary bird’s random movement, as in Eq. (7).
$$RB = randn(1, Dim) \tag{7}$$
where randn(1, Dim) denotes a randomly generated array of dimension $1 \times Dim$ drawn from a standard normal distribution. In the algorithm, the bird is simulated as focusing on its target by moving toward the best solution found ($x_{best}$). This approach allows individuals to search using both the global best position and their own best positions, thus increasing the probability of reaching the global optimum. Brownian motion, by introducing randomness, facilitates better exploration of the solution space and helps prevent the search from becoming trapped in local optima. The prey-consumption stage can be modeled by Eq. (8).
$$x_{i,j}^{new} = x_{best} + \exp\left((t/T)^4\right) \times (RB - 0.5) \times (x_{best} - x_{i,j}), \quad \tfrac{1}{3}T < t < \tfrac{2}{3}T \tag{8}$$
where $x_{i,j}^{new}$ denotes the value of the new position in the $j$th dimension and $x_{best}$ denotes the current best value.
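The Brownian-motion stage can be illustrated with the short Python sketch below. It assumes the update form x_best + exp((t/T)^4) · (RB − 0.5) · (x_best − x), with RB drawn from a standard normal distribution per dimension; that form follows the published SBOA description as best understood here and should be checked against the original equations before reuse.

```python
import math
import random


def consume_prey(x, x_best, t, T, rng):
    """Move around the best solution with a Brownian (Gaussian) perturbation."""
    scale = math.exp((t / T) ** 4)  # grows slowly from 1 toward e as t approaches T
    return [x_best[j] + scale * (rng.gauss(0.0, 1.0) - 0.5) * (x_best[j] - x[j])
            for j in range(len(x))]


rng = random.Random(0)
candidate = consume_prey(x=[0.2, 0.8], x_best=[0.5, 0.5], t=50, T=100, rng=rng)
```

The perturbation is proportional to the distance from the best solution, so a bird that already sits on the best position does not move in this stage.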
Attacking Prey: Once the prey is exhausted, the secretary bird attacks at the most suitable moment. During the attack, the secretary bird uses its sharp talons to kick its prey, quickly killing it and thus avoiding any potential danger from the prey. The Lévy flight strategy was employed because it enhances the algorithm’s global search capability, reduces the risk of becoming trapped in local solutions, and improves convergence accuracy. Shorter steps improve precision, whereas longer steps facilitate broader exploration of the solution space. Updating the secretary bird’s position at this stage can be modeled mathematically using Eq. (9).
$$x_{i,j}^{new} = x_{best} + \left(1 - \frac{t}{T}\right)^{2t/T} \times x_{i,j} \times RL, \quad t > \tfrac{2}{3}T \tag{9}$$
To enhance the algorithm’s optimization accuracy, a weighted Lévy flight mechanism (RL), as defined in Eq. (10), was employed.
$$RL = 0.5 \times Levy(Dim) \tag{10}$$
where Levy(Dim) denotes the Lévy flight distribution function.
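A possible Python sketch of the attack stage is shown below. Several details are assumptions rather than statements from the study: Lévy steps are generated with Mantegna's algorithm (a common choice) using an exponent of 1.5 and a step scale of 0.01, the weight of the Lévy term is 0.5, and the control factor is taken to be (1 − t/T)^(2t/T), which starts at 1 and decays to 0.

```python
import math
import random


def levy_step(rng, eta=1.5, scale=0.01):
    """One heavy-tailed step via Mantegna's algorithm (assumed generator)."""
    sigma = (math.gamma(1 + eta) * math.sin(math.pi * eta / 2)
             / (math.gamma((1 + eta) / 2) * eta * 2 ** ((eta - 1) / 2))) ** (1 / eta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return scale * u / abs(v) ** (1 / eta)


def control_factor(t, T):
    """Assumed control factor: equals 1 at t = 0 and 0 at t = T."""
    return (1 - t / T) ** (2 * t / T)


def attack_prey(x, x_best, t, T, rng):
    """New position: x_best + CF * x * RL, with RL = 0.5 * Levy step per dimension."""
    cf = control_factor(t, T)
    return [x_best[j] + cf * x[j] * 0.5 * levy_step(rng) for j in range(len(x))]


rng = random.Random(0)
candidate = attack_prey(x=[0.2, 0.8], x_best=[0.5, 0.5], t=80, T=100, rng=rng)
```

Most Lévy steps are small, which refines the solution near the best position, while the occasional long step gives the search a chance to leave a local optimum late in the run.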
Exploitation phase
Secretary birds’ natural enemies include predators like eagles, hawks, foxes, and coyotes. To defend themselves against these threats, secretary birds typically employ two strategies: running or flying to escape, and hiding by blending into their surroundings. C1 and C2 strategies mimic bird behavior during prey pursuit. C1 denotes positional adjustments observed when birds are near prey; these correspond to local search around the current best solution. Conversely, C2 reflects broader, more stochastic movements used under uncertainty. These movements enable limited global exploration and reduce premature convergence. In the first strategy, denoted $C_1$, when secretary birds detect a predator, they first search for camouflage and, if they cannot find it, they quickly run away. This is represented in the algorithm as a slight adjustment to the current solution, thereby improving it toward the best-known solution. If camouflage is ineffective, denoted $C_2$, the secretary bird runs or flies; in the algorithm, this situation is modeled by larger, random movements to prevent the solution from stagnating. Both camouflage and escape strategies are modeled using Eq. (11).
$$x_{i,j}^{new} = \begin{cases} C_1: \; x_{best} + (2 \times RB - 1) \times \left(1 - \frac{t}{T}\right)^{2} \times x_{i,j} \\ C_2: \; x_{i,j} + R_2 \times (x_{random} - K \times x_{i,j}) \end{cases} \tag{11}$$
where $R_2$ denotes a randomly generated array of size ($1 \times Dim$) following a normal distribution, $x_{random}$ refers to the randomly selected candidate solution in the current iteration, and K represents a randomly chosen integer (either 1 or 2), which is determined according to Eq. (12).
$$K = round(1 + rand(1, 1)) \tag{12}$$
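The two escape strategies can be sketched in Python as follows. The update forms used here (a shrinking Gaussian move around the best solution for C1, and a move relative to a randomly chosen bird with a random integer factor K for C2) follow the description above as best understood; choosing between the two strategies with probability 0.5 is an assumption, as are all function and parameter names.

```python
import random


def camouflage(x, x_best, t, T, rng):
    """C1: small move around the best solution, shrinking as t approaches T."""
    shrink = (1 - t / T) ** 2
    return [x_best[j] + (2 * rng.gauss(0.0, 1.0) - 1) * shrink * x[j]
            for j in range(len(x))]


def run_or_fly(x, population, rng):
    """C2: larger random move relative to a randomly chosen bird."""
    x_rand = rng.choice(population)
    k = round(1 + rng.random())  # random integer, either 1 or 2
    return [x[j] + rng.gauss(0.0, 1.0) * (x_rand[j] - k * x[j])
            for j in range(len(x))]


def escape(x, x_best, population, t, T, rng, p_camouflage=0.5):
    """Pick C1 or C2 at random; the 0.5 probability is an assumption."""
    if rng.random() < p_camouflage:
        return camouflage(x, x_best, t, T, rng)
    return run_or_fly(x, population, rng)


rng = random.Random(0)
flock = [[0.1, 0.9], [0.4, 0.6], [0.7, 0.2]]
candidate = escape(flock[0], x_best=flock[1], population=flock, t=10, T=100, rng=rng)
```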
Binary version of the algorithm
The SBOA is a modern metaheuristic inspired by the natural behavior of the secretary bird. It has been used for the past two years because of its effectiveness, especially in continuous search spaces. However, many real-world problems involve binary decision variables rather than continuous ones. For example, in tasks such as FS, object recognition, genetic data analysis, or network security, variables are typically represented as 0 or 1. Since SBOA cannot be directly applied to these problems, a binary version of the algorithm has been created39.
The Binary Secretary Bird Optimization Algorithm (BSBOA), like other binary optimization algorithms, employs a transfer function to adapt the update mechanisms from the continuous solution space to the binary space. Figure 2 shows the transition from continuous SBOA to the binary SBOA. At this stage, continuous variables are first scaled to the [0, 1] interval within a probabilistic framework. These values are then thresholded and converted to binary 0/1, typically using S-, V-, or taper-shaped transfer functions. Thus, the exploration and exploitation phases, which mimic the camouflage, sprinting, and flight strategies of secretary birds, are modeled appropriately in the binary decision space. Algorithm 1 presents the binary version of the SBOA.
Algorithm 1.
Pseudocode of binary secretary bird optimization algorithm
Fig. 2.
Transition from continuous SBOA to the binary SBOA.
Taper-shaped transfer function
Feature selection is a popular preprocessing technique in fields like deep learning, data mining, and ML to reduce model computational costs, improve accuracy, and prevent overfitting40. Transfer functions, which convert solution candidates from the continuous search space into the binary space, play an essential role in feature selection. Among these functions, the taper-shaped model offers a novel perspective. The taper-shaped transfer function, proposed as an alternative to traditional sigmoid or tangent functions, has emerged as a more flexible and adaptable transformation approach41.
S-shaped and V-shaped transfer functions typically involve exponential and trigonometric operations, which can increase computational overhead and, in some cases, affect the execution time of discrete evolutionary algorithms. In comparison, taper-shaped transfer functions rely on simpler probability mappings, which can help reduce computational cost during repeated evaluations. Moreover, their smoother transitions may contribute to more stable convergence by moderating abrupt binary-state changes throughout the search process.
He et al.41, inspired by U-shaped transfer functions, introduced four new taper-shaped transfer functions that employ power functions defined on the symmetric interval $[-A, A]$, where A is a positive real number. The formulas of the taper-shaped functions are presented in Eqs. (13)–(16), and their curves are shown in Fig. 3.
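$$T_1(x) = \frac{\sqrt{|x|}}{\sqrt{A}} \tag{13}$$
$$T_2(x) = \frac{|x|}{A} \tag{14}$$
$$T_3(x) = \frac{\sqrt[3]{|x|}}{\sqrt[3]{A}} \tag{15}$$
$$T_4(x) = \frac{\sqrt[4]{|x|}}{\sqrt[4]{A}} \tag{16}$$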
Fig. 3.
The curves of Taper-shaped transfer functions.
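To make the binary mapping concrete, the Python sketch below implements a power-type transfer value on [−A, A] and a stochastic binarization step. Both parts are assumptions for illustration: the transfer value is taken as (|x| / A) raised to a power such as 1/2, 1, 1/3, or 1/4, and a feature bit is set to 1 when a uniform random draw falls below the transfer value. The exact exponents and bit-update rule used in the study should be taken from the cited taper-shaped transfer function paper.

```python
import random


def taper(x, A, power=0.5):
    """Power-type transfer value in [0, 1] for x in [-A, A] (assumed form)."""
    clipped = min(abs(x), A)  # keep out-of-range positions inside the interval
    return (clipped / A) ** power


def binarize(position, A, rng, power=0.5):
    """Assumed rule: a bit becomes 1 when a uniform draw is below T(x)."""
    return [1 if rng.random() < taper(x, A, power) else 0 for x in position]


rng = random.Random(0)
continuous_position = [0.0, 5.0, -2.5, 1.0]
feature_mask = binarize(continuous_position, A=5.0, rng=rng)
```

Unlike S- and V-shaped functions, this mapping needs only an absolute value and a power, which is the computational advantage the text attributes to taper-shaped functions.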
Experimental results
This section evaluates the performance of the proposed feature subset selection method by comparing it with the CSO, HHO, and PSO algorithms. The binary versions of these algorithms are applied to the RT-IoT2022 and IoTID20 datasets using the parameters listed in Table 2. The control factor, CF, is adaptively adjusted across iterations to balance exploration and exploitation: it starts high to encourage global search and gradually decreases to intensify local exploitation toward the end of the optimization process. Additionally, the k-NN, SVM, and RF classifiers are employed.
Table 2.
Parameters for the algorithms.
| Parameter | Value |
|---|---|
| Common parameters | |
| Number of search individuals | 20 |
| Number of iterations | 100 |
| Number of runs | 10 |
| Parameters for SBOA | |
| CF | Decreases gradually toward 0 |
| Lévy flight factor | 1.5 |
| Random coefficients | [0,1], [0,1] |
| Parameters for HHO | |
| Lévy flight exponent ($\beta$) | 1.5 |
| Parameters for CSO | |
| Discovery probability ($p_a$) | 0.25 |
k-NN, SVM, and RF were selected to represent distance-based, margin-based, and ensemble-based classifiers, respectively. The superior performance of RF is primarily due to its robustness to noise and its ability to capture complex, nonlinear feature interactions, which are common in high-dimensional IoT datasets.
In this study, the fitness evaluation was performed using a standard k-nearest neighbor (k-NN) classifier with k = 5 and the Euclidean distance metric; no distance-based weighting was applied. SVM and RF classifiers were implemented using MATLAB’s fitcsvm and fitcensemble functions, respectively; RF used 100 trees, and all other parameters were kept at their default values. Model performance was assessed using 10-fold cross-validation, with stratified random sampling applied to maintain class balance across folds.
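The evaluation protocol for the fitness classifier can be approximated with the self-contained Python sketch below: an unweighted k-NN with Euclidean distance (k = 5) scored by stratified 10-fold cross-validation on the features selected by a binary mask. It is an illustration only and does not reproduce the MATLAB implementation; the function names, the round-robin fold assignment, and the synthetic two-cluster data are assumptions.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_folds(labels, n_folds, rng):
    """Deal the shuffled indices of each class round-robin into the folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def knn_predict(train_X, train_y, x, k=5):
    """Unweighted majority vote among the k nearest points (Euclidean distance)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cv_error(X, y, mask, n_folds=10, k=5, seed=0):
    """Cross-validated error rate using only the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    Xs = [[row[j] for j in cols] for row in X]
    wrong = 0
    for fold in stratified_folds(y, n_folds, random.Random(seed)):
        held_out = set(fold)
        train = [i for i in range(len(Xs)) if i not in held_out]
        train_X = [Xs[i] for i in train]
        train_y = [y[i] for i in train]
        wrong += sum(knn_predict(train_X, train_y, Xs[i], k) != y[i] for i in fold)
    return wrong / len(y)


# Synthetic data: feature 0 separates the classes, feature 1 is pure noise.
data_rng = random.Random(1)
X = ([[data_rng.random(), data_rng.random()] for _ in range(30)]
     + [[10.0 + data_rng.random(), data_rng.random()] for _ in range(30)])
y = [0] * 30 + [1] * 30
err = cv_error(X, y, mask=[1, 0])
```

The returned error rate is the quantity that would be fed into the fitness function together with the number of ones in the mask.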
Experiments were conducted on an Intel(R) Core(TM) i9-14900HX CPU (5.80 GHz, 5.60 GHz) with 64 GB of RAM and an NVIDIA GeForce RTX 5070 Ti graphics card. The programming environment is MATLAB R2025a running on Windows 11. To ensure reproducibility, a fixed random seed was used for all stochastic components of the algorithms, and parallel computing was disabled so that all experiments were executed in a single-threaded setting.
Datasets
RT-IoT2022
The RT-IoT2022 dataset is a comprehensive resource created to identify cyberattacks on real-time IoT systems42. This dataset aims to capture both normal operations and attack scenarios of embedded systems running on real-time operating systems. Table 3 summarizes the dataset. The dataset includes explicit threats related to resource consumption and timing-based attacks. It covers scenarios such as timing attacks, DoS, and task hijacking. All data were analyzed using attributes like CPU usage, memory usage, timestamps, system calls, task status, and timings, and were labeled as either attacks or normal. Three classes were labeled as normal, and nine classes as attacks. Information about these classes is provided in Table 4.
Table 3.
Summary of the RT-IoT2022 dataset.
| Information | Description |
|---|---|
| Category | Imbalanced |
| Format | .csv |
| Number of attributes | 85 |
| Total number of samples | 123,117 |
| Services | DNS, SSL, HTTP, MQTT |
| Protocol | TCP, UDP |
Table 4.
RT-IoT2022 dataset classes.
| Class name | Data count | Class type |
|---|---|---|
| MQTT Publish | 829 | Normal |
| Thing Speak | 1622 | Normal |
| Wipro bulb | 51 | Normal |
| DOS SYN Hping | 18932 | Attack |
| ARP Poisoning | 1550 | Attack |
| NMAP UDP SCAN | 518 | Attack |
| DDOS Slowloris | 107 | Attack |
| NMAP XMAS TREE SCAN | 402 | Attack |
| NMAP OS DETECTION | 400 | Attack |
| NMAP TCP scan | 200 | Attack |
| Metasploit Brute Force SSH | 7 | Attack |
| NMAP FIN SCAN | 6 | Attack |
IoTID20
The performance of the proposed algorithm was also evaluated on the IoTID20 dataset. This dataset, presented by Ullah and Mahmoud43, comprises 625,783 instances and 83 features, including critical attack types such as Mirai, DoS, Scan, and MITM, targeting devices (smart cameras and peripherals) in the smart home ecosystem, as detailed in Table 5. In the experiments, feature selection was performed on a randomly sampled 10% subset of the dataset to optimize computational cost and identify the most effective features. In the evaluation phase, the entire dataset was used to assess the model’s generalizability and performance stability. For training and testing the classification model, the 80-20% (Training-Test) hold-out approach, commonly used in the literature, was adopted, and the model’s performance on previously unseen data was rigorously evaluated.
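The sampling scheme described above (feature selection on a random 10% subset, then an 80-20 hold-out on the full data) can be sketched as below. This is an illustrative Python sketch, not the study's code; `random_subset`, `holdout_split`, the seed, and the toy sample count are assumptions.

```python
import random


def random_subset(n_samples, fraction, seed=0):
    """Indices of a simple random sample covering `fraction` of the data."""
    rng = random.Random(seed)
    size = round(n_samples * fraction)
    return sorted(rng.sample(range(n_samples), size))


def holdout_split(indices, test_fraction=0.2, seed=0):
    """Shuffle the indices and split them into (train, test) parts."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


n_samples = 1000  # toy size; IoTID20 itself has 625,783 instances
fs_subset = random_subset(n_samples, 0.10)                   # used only for feature selection
train_idx, test_idx = holdout_split(range(n_samples), 0.20)  # used for final evaluation
print(len(fs_subset), len(train_idx), len(test_idx))
```

Running feature selection on the small subset keeps the wrapper's repeated classifier evaluations cheap, while the final hold-out uses all instances.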
Table 5.
IoTID20 dataset classes.
| Sub-category name | Category name (Frequency) | Data count (Frequency) | Class type (Frequency) |
|---|---|---|---|
| Normal | Normal (6.40%) | 40073 (6.40%) | Normal (6.40%) |
| Mirai-UDP Flooding | Mirai (66.40%) | 183554 (29.30%) | Anomaly (93.60%) |
| Mirai-Hostbruteforceg | Mirai (66.40%) | 121181 (19.40%) | Anomaly (93.60%) |
| Mirai-HTTP Flooding | Mirai (66.40%) | 55818 (8.90%) | Anomaly (93.60%) |
| Mirai-Ackflooding | Mirai (66.40%) | 55124 (8.80%) | Anomaly (93.60%) |
| Scan Port OS | Scan (12%) | 53073 (8.50%) | Anomaly (93.60%) |
| Scan Hostport | Scan (12%) | 22192 (3.50%) | Anomaly (93.60%) |
| DoS-Synflooding | DoS (9.50%) | 59391 (9.50%) | Anomaly (93.60%) |
| MITM ARP Spoofing | MITM ARP Spoofing (5.70%) | 35377 (5.70%) | Anomaly (93.60%) |
Confusion matrix and evaluation metrics
The confusion matrix provides a comprehensive view of a classification model’s performance. Table 6 summarizes its key elements. The abbreviations are defined as follows:
True Positive (TP): An actual threat is correctly classified as a threat.
True Negative (TN): A normal activity is correctly classified as normal.
False Positive (FP): A normal activity is incorrectly classified as a threat.
False Negative (FN): An actual threat is incorrectly classified as normal.
Table 6.
Confusion matrix.
| | Predicted: Attack | Predicted: Normal |
|---|---|---|
| Actual: Attack | TP | FN |
| Actual: Normal | FP | TN |
The SBOA, implemented for binary FS on the RT-IoT2022 dataset, has been assessed using various metrics. These metrics evaluate the classification performance from different perspectives, demonstrating the model’s reliability and accuracy. The metrics can be calculated using Eqs. (17-22).
[Eqs. (17)–(22): equation images not recovered from the source.]
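For reference, the standard textbook definitions of the five metrics named in this section, written in terms of the confusion-matrix counts in Table 6, are sketched below. They are assumed, not confirmed, to correspond to the paper's numbered equations, and no equation numbers are assigned here.

```latex
\begin{align*}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Sensitivity} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{Precision}   &= \frac{TP}{TP + FP} \\
\text{F-measure}   &= \frac{2 \cdot \text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}
\end{align*}
```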
Numerical results
This subsection presents the numerical results from the experiments. The performance of four binary optimization algorithms, CSO, HHO, PSO, and SBOA, is evaluated alongside three popular classifiers: k-NN, SVM, and RF. The analysis uses the RT-IoT2022 and IoTID20 datasets and evaluates performance using accuracy, sensitivity, specificity, precision, and F-measure to provide a comprehensive assessment of ID in IoT environments. The best results are highlighted in bold in the tables for clarity. Table 7 presents a comparative performance of the algorithms used in terms of average computation time, number of selected features, and classification performance over 10 runs.
Table 7.
Comparative performance of optimization algorithms on IoT datasets.
| Datasets | Algorithms | Time (min.) (Mean ± Std) | Selected features (Mean ± Std) | Accuracy (%) (Mean ± Std) |
|---|---|---|---|---|
| RT-IoT2022 | BCSO | [missing] | [missing] | [missing] |
| RT-IoT2022 | BHHO | [missing] | [missing] | [missing] |
| RT-IoT2022 | BPSO | [missing] | [missing] | [missing] |
| RT-IoT2022 | BSBOA | [missing] | [missing] | [missing] |
| IoTID20 | BCSO | [missing] | [missing] | [missing] |
| IoTID20 | BHHO | [missing] | [missing] | [missing] |
| IoTID20 | BPSO | [missing] | [missing] | [missing] |
| IoTID20 | BSBOA | [missing] | [missing] | [missing] |
Table 7 shows that BSBOA holds a clear advantage over the other metaheuristic methods on both the RT-IoT2022 and IoTID20 datasets. In terms of feature selection efficiency, BSBOA selected an average of 4.5 features on RT-IoT2022 and 7 features on IoTID20, the strongest dimensionality reduction among the compared algorithms. Despite this reduction, BSBOA achieved the highest classification accuracies, 99.38% and 99.87%, respectively. In terms of computational cost, BCSO is the most expensive on both datasets, whereas BHHO and BPSO are faster. BSBOA, however, offers the best accuracy and the best balance between accuracy and feature reduction.
Tables 8 and 9 present a comparative analysis of the classification performance of the selected feature sets using k-NN, SVM, and RF across the binary optimization algorithms on the RT-IoT2022 and IoTID20 datasets. The RF classifier delivers the most balanced performance and, in most cases, the highest accuracy, sensitivity, specificity, precision, and F-measure across all algorithms and both datasets. On the RT-IoT2022 dataset, the BPSO–RF combination achieves the highest values on every metric except sensitivity. BSBOA remains competitive despite using far fewer features, highlighting its effectiveness in feature reduction. On the IoTID20 dataset, the BCSO–RF and BPSO–RF combinations achieve the highest classification performance (100% across all metrics), and BHHO–RF yields comparably high results with a very compact feature subset. In contrast, SVM-based models exhibit imbalanced classification behavior: on IoTID20 they reach 100% sensitivity but zero specificity for every algorithm, indicating that all samples are assigned to the majority class, and on RT-IoT2022 their specificity drops sharply for the compact BHHO and BSBOA subsets (63.60% and 42.94%). The k-NN classifier exhibits relatively stable performance but remains inferior to RF in overall classification effectiveness.
Table 8.
Classification performance of optimization algorithms with different classifiers on the RT-IoT2022 dataset.
| Algorithms | Classifiers | Features used | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F Measure (%) |
|---|---|---|---|---|---|---|---|
| BCSO | k-NN | 41 | 99.11 | 99.44 | 96.15 | 99.56 | 99.50 |
| BCSO | SVM | 41 | 98.60 | 99.23 | 93.04 | 99.21 | 99.22 |
| BCSO | RF | 41 | 99.83 | 99.97 | 98.62 | 99.84 | 99.90 |
| BHHO | k-NN | 7 | 99.42 | 99.63 | 97.59 | 99.73 | 99.68 |
| BHHO | SVM | 7 | 94.88 | 98.42 | 63.60 | 95.99 | 97.19 |
| BHHO | RF | 7 | 99.73 | 99.93 | 97.91 | 99.76 | 99.85 |
| BPSO | k-NN | 42 | 99.26 | 99.56 | 96.55 | 99.61 | 99.59 |
| BPSO | SVM | 42 | 98.83 | 99.31 | 94.55 | 99.38 | 99.35 |
| BPSO | RF | 42 | 99.90 | 99.96 | 99.38 | 99.93 | 99.94 |
| BSBOA | k-NN | 6 | 99.34 | 99.38 | 99.02 | 99.89 | 99.63 |
| BSBOA | SVM | 6 | 93.13 | 98.81 | 42.94 | 93.87 | 96.27 |
| BSBOA | RF | 6 | 99.69 | 99.93 | 97.49 | 99.72 | 99.83 |
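As a quick check on how the columns of Table 8 relate, the F-measure can be recomputed from the reported precision and sensitivity. The sketch below assumes the F-measure is the usual harmonic mean of the two; because the table entries are rounded, a recomputed value may differ from the reported one in the last digit for some rows. The function name `f_measure` is illustrative.

```python
def f_measure(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (both given in percent)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Table 8, BSBOA with k-NN: precision 99.89 %, sensitivity 99.38 %
bsboa_knn = f_measure(99.89, 99.38)
# Table 8, BCSO with RF: precision 99.84 %, sensitivity 99.97 %
bcso_rf = f_measure(99.84, 99.97)

print(round(bsboa_knn, 2))  # 99.63, matching the reported F-measure
print(round(bcso_rf, 2))    # 99.9, matching the reported 99.90
```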
Table 9.
Classification performance of optimization algorithms with different classifiers on the IoTID20 dataset.
| Algorithms | Classifiers | Features used | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F Measure (%) |
|---|---|---|---|---|---|---|---|
| BCSO | k-NN | 38 | 98.90 | 99.70 | 87.27 | 99.13 | 99.42 |
| BCSO | SVM | 38 | 93.60 | 100 | 0 | 93.60 | 96.69 |
| BCSO | RF | 38 | 100 | 100 | 100 | 100 | 100 |
| BHHO | k-NN | 5 | 99.88 | 99.97 | 98.65 | 99.91 | 99.94 |
| BHHO | SVM | 5 | 93.60 | 100 | 0 | 93.60 | 96.69 |
| BHHO | RF | 5 | 99.95 | 99.99 | 99.29 | 99.95 | 99.97 |
| BPSO | k-NN | 33 | 99.98 | 99.68 | 88.73 | 99.23 | 99.45 |
| BPSO | SVM | 33 | 93.60 | 100 | 0 | 93.60 | 96.69 |
| BPSO | RF | 33 | 100 | 100 | 100 | 100 | 100 |
| BSBOA | k-NN | 7 | 98.30 | 99.64 | 78.70 | 98.56 | 99.09 |
| BSBOA | SVM | 7 | 93.60 | 100 | 0 | 93.60 | 96.69 |
| BSBOA | RF | 7 | 98.46 | 99.75 | 79.58 | 98.62 | 99.18 |
BSBOA selected six features on the RT-IoT2022 dataset because this subset yields the best balance between accuracy and dimensionality reduction. The selected features capture key traffic and behavioral characteristics relevant to intrusion detection, highlighting the method’s ability to identify the most informative attributes. Specifically, the six selected features capture complementary aspects of network behavior, including service-level access patterns, TCP flag anomalies, traffic volume characteristics, and flow-level transmission dynamics. These features are closely associated with common IoT attack behaviors, such as SYN flooding, anomalous payload transmission, and command-and-control communication. Selecting only six features shows that BSBOA effectively removes redundancy while preserving the most informative attributes, resulting in efficient and accurate intrusion detection.
The convergence curves in Figs. 4 and 6 show the iteration-by-iteration performance improvement and the speed at which the proposed and compared algorithms (BCSO, BSBOA, BHHO, and BPSO) reach the minimum error. On both datasets, BSBOA achieved significantly lower fitness values than the competing algorithms. As iterations progressed (especially between the 20th and 60th), BSBOA exhibited a staircase-like decline, escaping local optima and moving toward the global solution. In terms of overall performance and minimum error, BSBOA performed best, followed by BHHO, BPSO, and BCSO, which had the highest error. As shown in Fig. 4, BSBOA forms a stable plateau from approximately the 90th iteration onward, indicating that the algorithm has converged.
Fig. 4.
Comparative convergence curves of the algorithms for RT-IoT2022.
Fig. 6.
Comparative convergence curves of the algorithms for IoTID20.
In the convergence curves shown in Figs. 5 and 7, the error bars on the fitness values reflect the standard deviations obtained from 10 independent runs of the algorithms. The error bars along the BSBOA convergence curve remain within a narrow range, indicating that the algorithm produces similar and stable results in each run despite its stochastic nature. Fig. 8 shows the ROC curve of an RF classification model trained on the features selected by BSBOA.
Fig. 5.
Convergence curves with error bar on RT-IoT2022 dataset.
Fig. 7.
Convergence curves with error bar on IoTID20 dataset.
Fig. 8.
ROC curve of BSBOA algorithm with random forest classification on RT-IoT2022.
Ablation study
An ablation study was conducted to evaluate the effectiveness of the proposed method. This analysis examines how individual components and design choices contribute to performance and how removing or altering them affects it, which clarifies the importance of each module and the overall robustness and efficiency of the approach. Table 10 compares the results obtained with all features and with the selected feature subsets for the BCSO, BHHO, BPSO, and BSBOA algorithms across the three classifiers, to analyze the impact of FS on classification performance. Experiments on the RT-IoT2022 dataset reveal a relationship between an algorithm’s feature-selection capability and the model’s predictive power. BSBOA achieved a 92.59% reduction in dimensionality by selecting six of the original 81 features. With this minimal feature set, the k-NN classifier achieved 99.02% specificity and 99.89% precision, showing that the six selected features effectively distinguish the negative class and reduce false positives. Among the algorithms, however, BHHO (seven features) performed slightly better than BSBOA (six features) on most metrics; for example, with the RF classifier, BHHO achieved 99.73% accuracy versus 99.69% for BSBOA. Considering overall metric averages, BHHO provided more comprehensive learning by using one additional feature.
Table 10.
Feature selection impact on classification performance on RT-IoT2022 dataset.
| Algorithms | Classifiers | Features used | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F Measure (%) |
|---|---|---|---|---|---|---|---|
| BCSO | k-NN | All features (Dim = 81) | 99.26 | 99.54 | 96.73 | 99.63 | 99.59 |
| BCSO | k-NN | Sel. features (Dim = 41) | 99.11 | 99.44 | 96.15 | 99.56 | 99.50 |
| BCSO | SVM | All features (Dim = 81) | 99.05 | 99.48 | 95.23 | 99.46 | 99.47 |
| BCSO | SVM | Sel. features (Dim = 41) | 98.60 | 99.23 | 93.04 | 99.21 | 99.22 |
| BCSO | RF | All features (Dim = 81) | 99.90 | 99.96 | 99.36 | 99.93 | 99.95 |
| BCSO | RF | Sel. features (Dim = 41) | 99.83 | 99.97 | 98.62 | 99.84 | 99.90 |
| BHHO | k-NN | All features (Dim = 81) | 99.25 | 99.55 | 96.63 | 99.62 | 99.58 |
| BHHO | k-NN | Sel. features (Dim = 7) | 99.42 | 99.63 | 97.59 | 99.73 | 99.68 |
| BHHO | SVM | All features (Dim = 81) | 99.05 | 99.48 | 95.23 | 99.46 | 99.47 |
| BHHO | SVM | Sel. features (Dim = 7) | 94.88 | 98.42 | 63.60 | 95.99 | 97.19 |
| BHHO | RF | All features (Dim = 81) | 99.89 | 99.97 | 99.24 | 99.91 | 99.94 |
| BHHO | RF | Sel. features (Dim = 7) | 99.73 | 99.93 | 97.91 | 99.76 | 99.85 |
| BPSO | k-NN | All features (Dim = 81) | 99.25 | 99.54 | 96.62 | 99.62 | 99.58 |
| BPSO | k-NN | Sel. features (Dim = 42) | 99.26 | 99.56 | 96.55 | 99.61 | 99.59 |
| BPSO | SVM | All features (Dim = 81) | 99.05 | 99.48 | 95.25 | 99.46 | 99.47 |
| BPSO | SVM | Sel. features (Dim = 42) | 98.83 | 99.31 | 94.55 | 99.38 | 99.35 |
| BPSO | RF | All features (Dim = 81) | 99.88 | 99.95 | 99.18 | 99.91 | 99.93 |
| BPSO | RF | Sel. features (Dim = 42) | 99.90 | 99.96 | 99.38 | 99.93 | 99.94 |
| BSBOA | k-NN | All features (Dim = 81) | 99.25 | 99.55 | 96.57 | 99.61 | 99.58 |
| BSBOA | k-NN | Sel. features (Dim = 6) | 99.34 | 99.38 | 99.02 | 99.89 | 99.63 |
| BSBOA | SVM | All features (Dim = 81) | 99.05 | 99.48 | 95.27 | 99.46 | 99.47 |
| BSBOA | SVM | Sel. features (Dim = 6) | 93.13 | 98.81 | 42.94 | 93.87 | 96.27 |
| BSBOA | RF | All features (Dim = 81) | 99.88 | 99.96 | 99.22 | 99.91 | 99.93 |
| BSBOA | RF | Sel. features (Dim = 6) | 99.69 | 99.93 | 97.49 | 99.72 | 99.83 |
Table 11 details the impact of feature selection on classification performance on the IoTID20 dataset. Overall, the selected feature sets improve class-separation metrics for all optimization algorithms, particularly with the k-NN classifier. For example, with only 5 features selected by BHHO, the k-NN classifier increased its accuracy from 97.36% to 99.88% and its specificity from 70.42% to 98.65%. Similarly, with k-NN, the BPSO and BCSO subsets markedly increased specificity and F-measure with fewer features. The RF classifier, which already performed optimally with all features, largely maintained this performance with the selected features and, in some cases, achieved identical results in a smaller feature space. These findings demonstrate that the proposed feature selection approaches maintain or improve classification performance while reducing computational load and are effective preprocessing steps, particularly for k-NN and RF-based models.
Table 11.
Feature selection impact on classification performance on IoTID20 dataset.
| Algorithms | Classifiers | Features used | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | F Measure (%) |
|---|---|---|---|---|---|---|---|
| BCSO | k-NN | All features (Dim = 81) | 99.34 | 99.16 | 69.82 | 98.03 | 98.59 |
| BCSO | k-NN | Sel. features (Dim = 38) | 98.90 | 99.70 | 87.27 | 99.13 | 99.42 |
| BCSO | SVM | All features (Dim = 81) | 93.81 | 100 | 0 | 93.81 | 96.81 |
| BCSO | SVM | Sel. features (Dim = 38) | 93.60 | 100 | 0 | 93.60 | 96.69 |
| BCSO | RF | All features (Dim = 81) | 100 | 100 | 100 | 100 | 100 |
| BCSO | RF | Sel. features (Dim = 38) | 100 | 100 | 100 | 100 | 100 |
| BHHO | k-NN | All features (Dim = 81) | 97.36 | 99.13 | 70.42 | 98.07 | 98.60 |
| BHHO | k-NN | Sel. features (Dim = 5) | 99.88 | 99.97 | 98.65 | 99.91 | 99.94 |
| BHHO | SVM | All features (Dim = 81) | 93.81 | 100 | 0 | 93.81 | 96.81 |
| BHHO | SVM | Sel. features (Dim = 5) | 93.60 | 100 | 0 | 93.60 | 96.69 |
| BHHO | RF | All features (Dim = 81) | 100 | 100 | 100 | 100 | 100 |
| BHHO | RF | Sel. features (Dim = 5) | 99.95 | 99.99 | 99.29 | 99.95 | 99.97 |
| BPSO | k-NN | All features (Dim = 81) | 97.31 | 99.11 | 70.03 | 98.04 | 98.58 |
| BPSO | k-NN | Sel. features (Dim = 33) | 99.98 | 99.68 | 88.73 | 99.23 | 99.45 |
| BPSO | SVM | All features (Dim = 81) | 93.81 | 100 | 0 | 93.81 | 96.81 |
| BPSO | SVM | Sel. features (Dim = 33) | 93.60 | 100 | 0 | 93.60 | 96.69 |
| BPSO | RF | All features (Dim = 81) | 100 | 100 | 100 | 100 | 100 |
| BPSO | RF | Sel. features (Dim = 33) | 100 | 100 | 100 | 100 | 100 |
| BSBOA | k-NN | All features (Dim = 81) | 97.33 | 99.15 | 69.67 | 98.02 | 98.58 |
| BSBOA | k-NN | Sel. features (Dim = 7) | 98.30 | 99.64 | 78.70 | 98.56 | 99.09 |
| BSBOA | SVM | All features (Dim = 81) | 93.81 | 100 | 0 | 93.81 | 96.81 |
| BSBOA | SVM | Sel. features (Dim = 7) | 93.60 | 100 | 0 | 93.60 | 96.69 |
| BSBOA | RF | All features (Dim = 81) | 100 | 100 | 100 | 100 | 100 |
| BSBOA | RF | Sel. features (Dim = 7) | 98.46 | 99.75 | 79.58 | 98.62 | 99.18 |
Figures 9 and 10 show the accuracy versus feature-reduction ratio for each classifier on both datasets. Figure 9 shows that BSBOA achieved the highest feature reduction rate on the RT-IoT2022 dataset, 92.59%, well above BCSO (49.38%) and BPSO (48.15%). Despite this high reduction rate, classification accuracy remains high with k-NN (99.34%) and RF (99.69%), though marginally lower than that of BHHO (99.42% and 99.73%). As shown in Fig. 10, on the IoTID20 dataset BSBOA achieves a substantial 91.36% reduction, but BHHO achieves both a higher reduction rate (93.83%) and better classification performance. With the RF classifier, BHHO reaches 99.95% accuracy and BCSO and BPSO reach 100%, whereas BSBOA remains at 98.46%, indicating that its search mechanism was less effective at capturing critical feature interactions in this dataset.
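The feature reduction ratios quoted above follow from the number of selected features out of the 81 available. A minimal sketch of the arithmetic, with the hypothetical helper name `reduction_ratio`:

```python
def reduction_ratio(total_features, selected_features):
    """Percentage of features removed by the selection step."""
    return 100.0 * (total_features - selected_features) / total_features


# Selected-feature counts from Tables 8 and 9, out of 81 features
print(round(reduction_ratio(81, 6), 2))   # BSBOA on RT-IoT2022 -> 92.59
print(round(reduction_ratio(81, 7), 2))   # BSBOA on IoTID20    -> 91.36
print(round(reduction_ratio(81, 5), 2))   # BHHO on IoTID20     -> 93.83
print(round(reduction_ratio(81, 41), 2))  # BCSO on RT-IoT2022  -> 49.38
print(round(reduction_ratio(81, 42), 2))  # BPSO on RT-IoT2022  -> 48.15
```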
Fig. 9.
Feature reduction and classification performance of optimization algorithms on RT-IoT2022 dataset.
Fig. 10.
Feature reduction and classification performance of optimization algorithms on IoTID20 dataset.
Ablation results indicate that the effect of FS on classification performance depends on the classifier employed. With the k-NN and RF classifiers, the selected feature subsets yielded results similar to, and sometimes higher than, those achieved with all features, despite using far fewer features. This demonstrates that redundant and irrelevant features are effectively eliminated, resulting in more compact yet powerful models.
Results obtained using the k-NN classifier indicate that FS generally has a positive impact on performance. On the RT-IoT2022 dataset, the feature subsets selected by BHHO, BPSO, and BSBOA produced slight increases in accuracy and F-measure while enabling the model to operate with fewer features; the BCSO subset was the exception, with a slight decrease. In particular, the feature sets selected with BSBOA and BHHO achieved accuracies above 99.3%. This suggests that k-NN benefits from feature reduction and that removing unnecessary features improves classification performance.
The SVM classifier results reveal a negative impact of FS. Notable reductions in accuracy and specificity were observed when using feature sets selected by the BCSO, BHHO, and BSBOA algorithms. Specifically, the subsets obtained with BSBOA and BHHO underperformed the complete feature set, with accuracy dropping below 95%.
Because S-shaped and V-shaped transfer functions involve exponential, trigonometric, inverse trigonometric, or other non-basic functions, their computational cost may be relatively high. This increases the execution time of the discretized evolutionary algorithm41. To address this, inspired by U-shaped transfer functions, a new class of transfer functions, the taper-shaped transfer function, is employed in this study, using a basis function (i.e., the power function) over a symmetric interval.
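The idea of a power-function transfer over a symmetric interval can be sketched as follows. This is an illustration under stated assumptions, not the exact function used in the study: the interval bound `bound`, the exponent `power`, the clipping step, and the rule "select the feature when a uniform random draw falls below the transfer value" are all assumed here for the example.

```python
import random


def taper_transfer(x, bound=1.0, power=0.5):
    """Power-function transfer value in [0, 1] for x in [-bound, bound]."""
    clipped = max(-bound, min(bound, x))  # keep x inside the symmetric interval
    return (abs(clipped) / bound) ** power


def binarize(position, bound=1.0, power=0.5, rng=random):
    """Set bit j to 1 when a uniform draw falls below the transfer value of x_j."""
    return [
        1 if rng.random() < taper_transfer(x, bound, power) else 0
        for x in position
    ]


rng = random.Random(42)
position = [0.0, 0.04, -0.49, 1.0]  # hypothetical continuous search-agent position
print([round(taper_transfer(x), 2) for x in position])
print(binarize(position, rng=rng))
```

The transfer value needs only an absolute value, a division, and a power, which is the source of the lower per-evaluation cost compared with exponential or trigonometric transfer functions.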
Table 12 presents a comparative analysis of BSBOA’s performance with different transfer functions on the two datasets, averaged over 10 independent runs. On the RT-IoT2022 dataset, the S-shaped transfer function had the highest computation time (277.91 ± 20.26 s) and the largest number of selected features (7.00 ± 2.98); its accuracy (99.33% ± 0.00067), although high, was the lowest of the three. The V-shaped transfer function achieved the lowest average run time (57.66 ± 9.95 s) and the highest accuracy (99.50% ± 0.00019), with 5.40 ± 1.07 selected features. The taper-shaped transfer function fell between the two in computation time (171.56 ± 21.64 s) and accuracy (99.38% ± 0.00033) and selected the fewest features on average (4.5 ± 3.95); however, the high variance in the number of selected features indicates that this approach is less stable than the V-shaped transfer function. Results on the IoTID20 dataset show a similar trend. The V-shaped transfer function stands out for computational efficiency, combining high accuracy (99.87% ± 0.00098) with the lowest average runtime (6.85 ± 3.03) and the fewest selected features (1.60 ± 0.52). The S-shaped transfer function offers only a limited accuracy advantage at the cost of a longer runtime and a larger feature set. The taper-shaped transfer function achieves accuracy comparable to the V-shaped one (99.87% ± 0.00014). Overall, the choice of transfer function has a decisive effect on both computational cost and classification performance.
Table 12.
Comparative performance of transfer functions for the BSBOA algorithm on the datasets.
| Datasets | Transfer functions | Time (min.) (Mean ± Std) | Selected features (Mean ± Std) | Accuracy (%) (Mean ± Std) |
|---|---|---|---|---|
| RT-IoT2022 | S Shaped | [missing] | [missing] | [missing] |
| RT-IoT2022 | V Shaped | [missing] | [missing] | [missing] |
| RT-IoT2022 | Taper Shaped | [missing] | [missing] | [missing] |
| IoTID20 | S Shaped | [missing] | [missing] | [missing] |
| IoTID20 | V Shaped | [missing] | [missing] | [missing] |
| IoTID20 | Taper Shaped | [missing] | [missing] | [missing] |
Results obtained with the RF classifier show that FS can maintain high performance or only slightly decrease it. On the RT-IoT2022 dataset, the feature subsets selected by BCSO, BHHO, and BSBOA yielded accuracies of 99.83%, 99.73%, and 99.69%, respectively, close to the 99.88–99.90% obtained with all features. On the IoTID20 dataset, RF achieved 100% accuracy with all features; the BCSO and BPSO subsets preserved this result and the BHHO subset reached 99.95%, while the BSBOA subset dropped to 98.46%. This indicates that RF, owing to its strong generalization ability, yields reliable results even with fewer features, and that FS largely maintains classification accuracy.
Statistical analysis
In this subsection, the Wilcoxon rank-sum test is used to identify significant differences among the results produced by the algorithms across 10 runs. The Wilcoxon rank-sum test is a well-known non-parametric statistical test that is typically used to compare the performance of two algorithms. It was selected because the performance metrics across different folds did not follow a normal distribution. A significance level of α = 0.05 was chosen as the standard threshold in computational research to balance the risk of false positives and false negatives. At this threshold, a difference is declared significant only if results at least as extreme would arise by chance less than 5% of the time when the two algorithms perform equally, which provides a robust basis for comparing the efficacy of different intrusion detection algorithms.
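A pairwise comparison of this kind can be sketched as below. This is a minimal Python illustration of the two-sided rank-sum test using the normal approximation; it is not the study's MATLAB code, the function name `rank_sum_test` and the two samples are hypothetical, and statistical packages may apply an exact method, a continuity correction, or a tie correction, so their p-values can differ slightly.

```python
import math


def rank_sum_test(sample_a, sample_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (z, p). Tied values receive average ranks; the variance is not
    tie-corrected, which is a simplification of this sketch.
    """
    pooled = sorted(
        (value, group)
        for group, sample in ((0, sample_a), (1, sample_b))
        for value in sample
    )
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for t in range(i, j + 1):
            ranks[t] = average_rank
        i = j + 1
    n_a, n_b = len(sample_a), len(sample_b)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n_a * (n_a + n_b + 1) / 2
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p


# Hypothetical fitness values from 10 runs of two algorithms
runs_a = [0.010 + 0.001 * i for i in range(10)]
runs_b = [0.050 + 0.001 * i for i in range(10)]
z, p = rank_sum_test(runs_a, runs_b)
print(p < 0.05)
```

With two fully separated samples of 10 runs each, as in this toy input, the approximation gives a p-value on the order of 10^-4, far below the 0.05 threshold.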
The obtained p-values are reported in Table 13. For both the RT-IoT2022 and IoTID20 datasets, most pairwise comparisons yield p-values < 0.05, indicating statistically significant differences between the corresponding methods. In particular, BCSO and BSBOA demonstrate statistically significant performance differences when compared to BHHO and BPSO across both datasets. Conversely, comparisons between BHHO and BPSO yield p-values greater than 0.05 (0.38288 for RT-IoT2022 and 0.27195 for IoTID20), indicating no statistically significant difference between the two algorithms. This suggests that BHHO and BPSO exhibit similar optimization behavior, whereas BCSO and BSBOA exhibit distinct, statistically distinguishable performance characteristics. These findings further strengthen the reliability of the comparative analysis by demonstrating that the observed performance improvements are not due to random variation.
Table 13.
Wilcoxon Rank-Sum test results for optimization algorithms on RT-IoT2022 and IoTID20 datasets (p-values < 0.05 are bolded).
| Dataset | Algorithm | BCSO | BSBOA | BHHO | BPSO |
|---|---|---|---|---|---|
| RT-IoT2022 | BCSO | – | **0.00017** | **0.00023** | **0.00017** |
| RT-IoT2022 | BSBOA | **0.00017** | – | **0.00017** | **0.00017** |
| RT-IoT2022 | BHHO | **0.00023** | **0.00017** | – | 0.38288 |
| RT-IoT2022 | BPSO | **0.00017** | **0.00017** | 0.38288 | – |
| IoTID20 | BCSO | – | **0.00017** | **0.02812** | **0.00211** |
| IoTID20 | BSBOA | **0.00017** | – | **0.00630** | **0.00017** |
| IoTID20 | BHHO | **0.02812** | **0.00630** | – | 0.27195 |
| IoTID20 | BPSO | **0.00211** | **0.00017** | 0.27195 | – |
Computational complexity analysis
The computational complexity of the proposed BSBOA mainly depends on the population size N, the maximum number of iterations T, the feature dimension D, and the cost of fitness evaluation. During each iteration, BSBOA updates the positions of all N birds through exploration and exploitation phases, resulting in a position update cost of $O(N \cdot D)$. The binarization process using taper-shaped transfer functions introduces negligible overhead because it relies on simple threshold-based operations.
The dominant computational cost arises from the fitness evaluation step, which involves training and validating the classifier. Assuming a fitness evaluation cost of $O(C)$ per candidate solution, the overall time complexity of BSBOA can be expressed as in Eq. (23):
$$O\big(T \cdot N \cdot (D + C)\big) \tag{23}$$
Compared with other population-based binary metaheuristic algorithms, BSBOA exhibits comparable computational complexity while converging faster, potentially reducing the number of required iterations in practice.
Discussion
This section presents the experimental results from this study, providing insight into the effectiveness of metaheuristic-based feature selection for ID in IoT environments. The study evaluated four binary optimization algorithms (BCSO, BHHO, BPSO, and BSBOA) with respect to computational efficiency, feature reduction, and classification accuracy, revealing notable differences. Specifically, the results indicate a significant trade-off between dimensionality reduction and classification robustness, particularly for the BSBOA algorithm.
Comparison with latest studies
A review of studies in the literature on ID highlights the use of various datasets, such as Bot-IoT, CICIoT2023, and RT-IoT2022. These are widely used due to their timeliness, the diversity of attacks they support, and their extensive feature sets. The RT-IoT2022 was used in this study because it provides traffic data suitable for real-time IoT environments. Given the comprehensive nature of these datasets, feature selection (FS) is crucial for maximizing the effectiveness of the extracted information. In ID systems, FS plays a key role in reducing the computational cost of high-dimensional data while enhancing classification accuracy.
Compared with recent literature, our approach demonstrates that integrating metaheuristic algorithms into FS achieves competitive accuracy while also addressing the scalability challenges observed in prior work. While studies such as those by Amr et al.44 and Albalwy and Almohaimeed45 report highly competitive results, their reliance on computationally intensive methods raises concerns about real-time deployment. Conversely, approaches that combine classical FS with ML models, such as that of Almohaimeed and Albalwy12, yield lower detection rates, indicating a trade-off between complexity and performance. The results summarized in Table 14 demonstrate that the proposed method achieves accuracy comparable to existing studies using only the selected features. However, while BSBOA provides an extremely compact feature set, it may exhibit slightly lower predictive performance than BHHO in certain high-dimensional scenarios, reflecting a deliberate trade-off between model parsimony and peak accuracy.
Table 14.
Comparison of the proposed methods with the latest studies.
| Study | Dataset | Methods | Accuracy (%) |
|---|---|---|---|
| Halim et al. (2021)46 | Bot-IoT, UNSW-NB15 | Genetic Algorithm based FS | 98.90 |
| Kareem et al. (2022)37 | NSL-KDD, CICIDS-2017, UNSW-NB15 and Bot-IoT | Metaheuristic Algorithms | 95.59 |
| Kumar et al.47 | ToN-IoT | Hybrid metaheuristic | 99.48 |
| Almohaimeed and Albalwy (2024)12 | RT-IoT2022 | Combined feature selections–MLP | 96.40 |
| Amr et al. (2025)44 | IoTID20, CICIoT2023, RT-IoT2022 | Joint Mutual Information (JMI), XGBoost | 99.53 |
| Albalwy and Almohaimeed (2025)45 | RT-IoT2022 | Pearson–PCA with ANN | 99.70 |
| This study | RT-IoT2022 | Metaheuristic Algorithms | 99.69 |
Another essential point evident from the literature is the dependency of performance on the chosen datasets. Although benchmark datasets such as Bot-IoT, CICIoT2023, and RT-IoT2022 have enabled consistent evaluation across studies, variations in attack scenarios and feature dimensionality often lead to inconsistent results. For instance, models that perform well on Bot-IoT may not generalize as well to RT-IoT2022, which has more complex traffic characteristics. This highlights the need for methods that are robust to dataset heterogeneity. Our proposed approach, evaluated on RT-IoT2022, demonstrates that effective feature selection combined with metaheuristic optimization can sustain high accuracy under such challenging conditions, indicating strong adaptability to broader IoT environments.
Advantages and limitations
The primary advantage of the proposed approach lies in its integration of feature selection with metaheuristic optimization, thereby improving classification accuracy and reducing computational complexity. Unlike traditional FS approaches, which often struggle with high-dimensional datasets, the proposed method effectively identifies the most informative features while minimizing redundancy. This not only enhances detection performance but also makes the approach more suitable for real-time IoT environments, where efficiency is critical. Furthermore, the evaluation on the RT-IoT2022 and IoTID20 datasets demonstrates strong adaptability to complex traffic characteristics, suggesting that the model can handle diverse attack scenarios more effectively than conventional techniques.
Despite these strengths, certain limitations must be acknowledged. A detailed audit of the results across multiple datasets reveals that BSBOA’s performance is context-dependent. For instance, on the IoTID20 dataset, BHHO demonstrated superior performance by identifying a smaller feature subset (5 features) than BSBOA (7 features), while achieving higher classification accuracy. This suggests that BSBOA’s search mechanism may occasionally converge to sub-optimal subsets in specific high-dimensional topologies. Although the RT-IoT2022 dataset provides realistic traffic data, further testing across multiple datasets or in real-world deployments would provide stronger validation of robustness. To enhance transparency, we acknowledge that while BSBOA is highly effective for extreme dimensionality reduction, it does not consistently outperform established metaheuristics in accuracy across all evaluated environments.
While RT-IoT2022 and IoTID20 provide comprehensive frameworks for evaluating intrusion detection, they differ significantly in their attack profiles. RT-IoT2022 focuses heavily on network traffic anomalies and protocol-specific exploits, whereas IoTID20 encompasses a broader range of DoS and Brute-Force scenarios. These differences enable a multifaceted evaluation of the proposed model. However, real-world IoT environments are evolving, with more sophisticated, low-frequency, and heterogeneous attack types. The proposed model is designed to operate in steady state after an initial robust training phase. By leveraging diverse datasets (IoTID20 and RT-IoT2022) during training, the model learns a broad spectrum of fundamental attack patterns. It is therefore expected to generalize well and to sustain high performance on live, dynamic network traffic without requiring constant retraining.
Several challenges inherent to metaheuristic optimization require further examination. The proposed algorithm is sensitive to hyperparameter initialization, including population size, crossover probability, and mutation rate. Inadequate tuning of these parameters may lead to premature convergence or local optima, requiring significant computational resources for optimization. In terms of scalability, although the method is efficient within the evaluated dimensions, its application to large-scale IoT networks with high-dimensional feature spaces may lead to nonlinear increases in computational overhead, potentially degrading real-time performance. Regarding computational efficiency, our findings show that BSBOA requires slightly more processing time (171.56 s) compared to BHHO (159.86 s) and BPSO (158.68 s), which is an important consideration for deployment in time-critical IoT nodes. Additionally, the model’s robustness to highly imbalanced class distributions and sensor noise, both prevalent in raw IoT traffic, remains a crucial concern. Without dedicated mechanisms, such as cost-sensitive fitness functions, the optimization process may become biased toward majority classes or overfit to noisy data, thereby reducing the ability to detect rare or subtle attack vectors.
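The cost-sensitive fitness idea mentioned above can be illustrated with a minimal sketch. It is not the fitness function used in this study: the names `balanced_error` and `cost_sensitive_fitness`, the weight `alpha = 0.99`, and the toy labels are all assumptions chosen for illustration. The sketch replaces the plain error rate in a typical wrapper fitness (weighted error plus subset-size penalty) with a balanced error, so that a rare attack class counts as much as the majority class.

```python
from collections import Counter


def balanced_error(y_true, y_pred):
    """Return 1 - mean per-class recall, so every class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    return 1.0 - sum(recalls) / len(recalls)


def cost_sensitive_fitness(y_true, y_pred, n_selected, n_total, alpha=0.99):
    """Value to minimize: weighted balanced error plus a subset-size penalty.

    alpha is a hypothetical trade-off weight, not a value from the paper.
    """
    size_penalty = n_selected / n_total
    return alpha * balanced_error(y_true, y_pred) + (1.0 - alpha) * size_penalty


# Toy labels (assumed): 8 normal (0) and 2 attack (1) samples.
# The classifier predicts "normal" for everything and misses every attack.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

plain_error = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
print("plain error:   ", plain_error)
print("balanced error:", balanced_error(y_true, y_pred))
print("fitness (6/81):", cost_sensitive_fitness(y_true, y_pred, 6, 81))
```

In this toy case the plain error rate looks acceptable because the majority class dominates, whereas the balanced error penalizes the missed attack class heavily, which is the bias the paragraph above warns about.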
Conclusions
The experimental evaluation on the RT-IoT2022 and IoTID20 datasets demonstrates that the proposed binary optimization algorithms achieve highly effective intrusion-detection performance in IoT environments. Among the classifiers, Random Forest (RF), except when paired with BPSO, consistently performed best across all metrics, achieving accuracies of 99.83% with BCSO, 99.73% with BHHO, and 99.69% with BSBOA on the RT-IoT2022 dataset. On the IoTID20 dataset, it achieved accuracies of 100% with BCSO and BPSO, 99.95% with BHHO, and 98.46% with BSBOA. These results indicate that integrating a metaheuristic feature selection method with RF classifiers not only reduces the dimensionality of high-volume IoT traffic but also maintains high accuracy, sensitivity, and precision. Such performance is particularly critical in IoT networks, where real-time, reliable detection mechanisms are essential to ensure security and resilience.
In addition to classification accuracy, feature reduction was a significant outcome of this study. The proposed BSBOA algorithm selected only 6 of 81 features, achieving the highest feature reduction ratio of 92.59% on the RT-IoT2022 dataset. Moreover, the algorithm selected only 7 of 81 features, achieving a feature reduction ratio of 91.36% and a classification accuracy of 98.46% on the IoTID20 dataset. This demonstrates that BSBOA can offer a viable alternative for feature selection while producing a more efficient model that reduces computational cost. Although all optimization algorithms achieved promising results, the choice of classifier still strongly influenced performance. SVM, for example, exhibited lower specificity, particularly with smaller feature subsets, thereby limiting its ability to accurately distinguish between normal and attack traffic in IoT contexts. The k-NN produced robust results but still lagged behind RF in overall detection rates. These findings highlight that both feature selection and classifier design are pivotal for intrusion detection in IoT environments.
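The feature reduction ratios quoted above follow directly from the subset sizes. As a small worked check (the function name `reduction_ratio` is an illustrative assumption), the ratio is the share of the 81 original features that were discarded:

```python
def reduction_ratio(n_selected, n_total):
    """Percentage of the original features removed by feature selection."""
    return 100.0 * (n_total - n_selected) / n_total


# Subset sizes reported for BSBOA: 6 of 81 (RT-IoT2022) and 7 of 81 (IoTID20).
rt_iot2022 = round(reduction_ratio(6, 81), 2)
iotid20 = round(reduction_ratio(7, 81), 2)
print(rt_iot2022, iotid20)  # 92.59 91.36
```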
Future research will address four specific directions to extend the proposed framework. First, the impact of distinct transfer functions will be investigated by comparing S-shaped, V-shaped, and U-shaped families to clarify their roles in balancing the exploration-exploitation trade-off during optimization. Second, to improve practical applicability, the proposed metaheuristic will be hybridized with lightweight Deep Learning architectures, such as 1D-CNN or LSTM, and deployed on resource-constrained edge devices to assess real-time latency. Third, Explainable AI (XAI) techniques, such as SHAP or LIME, will be integrated to provide transparency in feature selection, thereby enabling security analysts to understand the rationale for flagging specific traffic patterns as attacks. Finally, the approach will be evaluated by including additional, diverse datasets, such as BoT-IoT and Edge-IIoT, to test the model’s performance against a broader spectrum of zero-day vulnerabilities and cross-protocol attack vectors.
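To make the first direction concrete, the sketch below shows the textbook S-shaped (sigmoid) and V-shaped (absolute hyperbolic tangent) transfer functions with their usual binarization rules. These are common forms from the binary metaheuristics literature, not the taper-shaped function used in this study, and the helper names are assumptions. The S-shaped rule sets each bit directly from a probability, whereas the V-shaped rule flips the current bit, which is one reason the two families balance exploration and exploitation differently.

```python
import math
import random


def s_shaped(x):
    """Sigmoid transfer function: maps a real position to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x):
    """|tanh(x)| transfer function: small near 0, approaching 1 for large |x|."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: set a bit to 1 with probability s_shaped(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule: flip the current bit with probability v_shaped(x)."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


rng = random.Random(0)               # fixed seed so the sketch is repeatable
position = [2.0, -2.0, 0.0, 0.5]     # toy continuous position vector
current = [1, 0, 1, 0]               # toy current feature mask
print(binarize_s(position, rng))
print(binarize_v(position, current, rng))
```

A position component of zero leaves the V-shaped bit unchanged but gives the S-shaped bit an even chance of being selected, so the same continuous search can produce noticeably different feature masks under the two families.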
Author contributions
C.C. was responsible for the conception of the work, data acquisition and analysis, methodology, funding acquisition, and writing of the original draft; generated the figures; and reviewed the entire manuscript.
Funding
No funds, grants, or other support were received.
Data availability
The datasets used in the current study are available in the UCI Machine Learning Repository and Google Sites, and are available at the following URLs: https://doi.org/10.24432/C5P338, https://sites.google.com/view/iot-network-intrusion-dataset/home.
Code availability
The dataset links used in the current study have been shared in the repository. The implementation codes of the study and comparative algorithms are available at the GitHub repository: https://github.com/celal-can/Intrusion-Detection-for-IoT-Networks.
Declarations
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
Ethics approval and consent to participate
This research did not require additional ethics approval as it involved the analysis of an existing, publicly available dataset that did not contain any identifiable personal information.
Consent for publication
Not Applicable.
Human and animal rights
This article does not include studies conducted by any authors with human participants or animals.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.