Abstract
The Internet of Things (IoT) and Industrial IoT (IIoT) have rapidly evolved, reshaping modern industries through intelligent automation and seamless real-time connectivity. However, the inherent heterogeneity, limited resources, and distributed structure of these networks have made them vulnerable. Cyber attackers exploit these weaknesses to gain unauthorized access to systems through threats including data leakage, insider misuse, and Distributed Denial of Service (DDoS) attacks. To counter these security risks, we introduce ASTRID-Net, a novel deep learning architecture built as a triple-attention hybrid model that combines multi-scale convolutional feature extraction, bidirectional recurrent modeling, and residual learning for high-accuracy intrusion detection in IoT and IIoT networks. The framework integrates multi-scale Convolutional Neural Networks (CNNs) to extract spatial features, Bidirectional Gated Recurrent Units (BiGRUs) to capture temporal relationships, and a combined channel–temporal attention mechanism to prioritize the most relevant information in the data. Experimental evaluation reveals that ASTRID-Net attains an outstanding 99.97% accuracy, with macro-averaged precision, recall, and F1-score above 99.7%, outperforming conventional deep learning baselines. These results confirm the effectiveness and scalability of ASTRID-Net for real-time detection of complex cyber threats in IoT/IIoT infrastructures, contributing to the development of secure and adaptive cyber-physical systems.
Keywords: Squeeze-and-Excitation (SE) block, Multi-scale convolutional neural network (CNN), Bidirectional gated recurrent unit (BiGRU), Triple attention, Edge-IIoTset dataset, Cyber-physical systems security
Subject terms: Engineering, Mathematics and computing
Introduction
Intrusion in IoT and IIoT refers to any unauthorized access, malicious activity, or attack that compromises the confidentiality, integrity, or availability of connected devices, systems, or data across industrial and consumer networks. Although IoT and IIoT infrastructures serve critical domains such as manufacturing, energy, healthcare, and transportation, they present a vastly expanded attack surface due to their heterogeneous nature, resource-constrained devices, and reliance on lightweight protocols1–3. We develop an intelligent system for solving the intrusion identification problem in IoT and IIoT systems, one capable of accurately identifying and classifying diverse cyber threats.
As recent research shows, the threat landscape in IoT and IIoT environments surpasses the capabilities of conventional security mechanisms, which were developed for static, homogeneous, and resource-rich systems4,5. Unlike traditional networks, IoT/IIoT infrastructures operate under stringent constraints, including limited memory, processing power, and real-time responsiveness, making them highly vulnerable to novel and adaptive cyberattacks that can disrupt critical industrial operations, compromise safety, and cause significant economic and reputational damage. Furthermore, the increasing deployment of autonomous and mission-critical IoT systems demands proactive, intelligent, and context-aware security solutions6–8.
Intrusion detection in IoT and IIoT environments presents substantial challenges due to their dynamic, distributed, and heterogeneous characteristics. These networks comprise a wide range of devices with different computational capabilities, communication protocols, and data formats, which makes the development of a unified security framework highly complex. Additionally, the massive volume and rapid velocity of real-time data, along with the limited availability of labeled attack instances—particularly for rare or stealthy intrusions—hinder accurate threat detection. Factors such as noisy data, imbalanced class distributions, and the demand for low-latency processing further restrict the effectiveness of conventional deep learning approaches, highlighting the need for more adaptive, specialized, and resource-efficient solutions9,10. Despite significant progress in the field of intrusion detection, current research exhibits notable limitations when applied to IoT and IIoT environments. A prevailing gap lies in the over-reliance on obsolete or artificially synthesized datasets such as KDD99, NSL-KDD, or Bot-IoT which fail to reflect the real-time complexity, protocol diversity, and device heterogeneity characteristic of modern cyber-physical systems11. Consequently, many proposed models demonstrate high accuracy in controlled settings but struggle to generalize to real-world deployments. Moreover, a large body of existing work employs shallow machine learning techniques with manually engineered features, which are insufficient for capturing the high-dimensional and temporal dependencies inherent in network traffic data generated by IoT and IIoT devices1. Even contemporary deep learning approaches often overlook the importance of adaptive attention mechanisms, treating all input features and time steps with equal weight, thereby diminishing their ability to detect low-frequency, stealthy, or evolving threats. 
Additionally, most models are developed under the assumption of centralized data availability, ignoring the privacy constraints and computational limitations of edge-based or federated IIoT architectures, where data decentralization and bandwidth constraints are critical operational factors12. These limitations collectively indicate a significant research gap: the need for robust, scalable, and context-aware intrusion detection frameworks that are capable of learning from realistic, multi-modal data, extracting spatio-temporal patterns, and adapting intelligently to the dynamic threat landscape in resource-constrained, decentralized industrial environments.
To reduce the shortcomings of existing intrusion detection systems, which fail to capture complex spatiotemporal dependencies and feature interactions, we introduce ASTRID-Net (Adaptive Spatiotemporal Residual-Interpretable Detection Network), specifically designed for IoT and IIoT security.
Initially, our proposed model uses a Residual Multi-Scale Convolutional Block, which employs parallel convolutions with varying kernel sizes to extract multi-granular spatial representations from raw network traffic data. This design enables the model to learn both fine-grained and large-scale attack signatures, while the residual connection preserves essential low-level features and enhances gradient flow.
We introduce a triple-attention mechanism that refines feature representations across temporal, channel, and spatial dimensions. First, a Temporal Attention Module dynamically assigns importance to different time steps, ensuring that critical moments in network traffic receive focus. Next, a Channel Attention Mechanism is applied through a squeeze-and-excitation block that re-weights feature channels to emphasize the most discriminative patterns while suppressing irrelevant ones. Finally, a Spatial Attention Block highlights salient regions across time and feature dimensions, enabling the network to focus on subtle and spatially distributed attack signatures.
Key contributions of this research are listed below:
We develop ASTRID-Net, a hybrid model that uses multi-scale convolutional blocks to effectively extract spatial features at different temporal resolutions from IoT/IIoT network traffic.
Our model incorporates channel-wise Squeeze-and-Excitation and spatial attention mechanisms to emphasize significant features and relevant positions for improved intrusion detection accuracy.
We evaluate the proposed model on a large-scale IoT/IIoT dataset to achieve scalable, robust, and real-time detection of diverse cyber threats in resource-constrained environments.
The next section of this paper will present the Related Works, followed by the Proposed Methodology, then the Experimental Results and Analysis, and finally the Conclusion along with the References.
Related work
The security landscape of IoT and IIoT networks is changing rapidly and critically. Traditional Intrusion Detection Systems (IDS) rely on static, signature-based methods that can fail when facing novel or zero-day attacks. This limitation has forced a shift toward behavior-driven and intelligent detection approaches. Leveraging machine learning (ML) and deep learning (DL), detection systems can learn complex traffic patterns, identify anomalies, and dynamically adapt to emerging threats. Multi-attention CNNs in particular are well-suited to heterogeneous and resource-constrained IoT/IIoT environments.
Much previous research employed classical ML models such as Decision Trees, Support Vector Machines (SVMs), and K-Nearest Neighbors, achieving moderate success on well-established security datasets, including KDD99 and NSL-KDD. However, these approaches are less effective for analyzing modern threats and real-time intrusion detection in IIoT scenarios due to their reliance on handcrafted feature engineering and limited ability to adapt to evolving network behaviors2,13,14. More recent research has applied deep learning (DL) architectures, including CNNs, RNNs, and LSTMs, which can automatically extract hierarchical features and model temporal dependencies in traffic data, leading to higher detection accuracy15–17. Despite these improvements, many DL-based solutions still neglect attention mechanisms and often underperform when detecting minority-class attacks or handling highly imbalanced datasets18,19. In36, a fusion/aggregation strategy was adopted to improve the performance of the detection model and reduce communication overhead by allowing participating clients to join the federation process dynamically. Table 1 summarizes the limitations of recent research works.
Table 1.
Summary of research conducted in intrusion detection.
| Ref.No. | Methods | Accuracy | Limitation | Scope |
|---|---|---|---|---|
| 1 | RL (DQN) | 97% | Training cost, zero-day gaps | Advanced RL, federated learning |
| 2 | Feature Selection + Ensemble | 99.99% | Centralized, not real-time | Scalable & federated IDS |
| 3 | 1D CNN, LSTM, RNN, MLP | 99.1–99.5% | Latency, black-box | Edge computing, XAI |
| 4 | FFNN, LSTM, RandNN | 96–99.9% | High compute | Hybrid & federated IDS |
| 5 | ML + DL Hybrid | 99.99% | Dataset-specific | Hybrid IDS, auto feature selection |
| 6 | Random Forest + Ensemble | 99% | Dataset-specific | Multi-dataset, robust IDS |
Hybrid deep learning models combining CNNs with GRUs or LSTMs have recently emerged as an effective approach for capturing spatial–temporal dependencies in network traffic20. Nevertheless, many of these models lack integrated attention mechanisms capable of focusing on the most informative features and time steps, which reduces both interpretability and the ability to detect stealthy or low-frequency attacks. In addition, most evaluations are conducted on outdated or synthetic datasets, which do not fully represent the diversity and complexity of modern IIoT infrastructures21,22.
Another key challenge in existing research is the limited support for decentralized or privacy-aware intrusion detection. Although several federated learning–based IDS solutions have been proposed23, they rarely combine deep spatio-temporal modeling with real-time adaptability, restricting their applicability to industrial settings. Furthermore, critical factors such as explainability and computational efficiency, which are necessary for deployment on resource-constrained edge devices, are often overlooked19,24. Recently, many studies have explored integrating attention mechanisms, including Graph Attention Networks and temporal attention modules25,26, improving ensemble architectures, and developing lightweight IDS models for smart grids and edge computing environments27. However, few works propose a comprehensive solution that unifies multi-scale CNNs, bidirectional temporal modeling, and dual attention mechanisms, particularly one validated on a realistic heterogeneous dataset such as Edge-IIoTset.
To overcome these shortcomings, we introduce a novel triple-attention method called ASTRID-Net that combines residual multi-scale CNNs with spatial, temporal, and channel attention. The proposed model is rigorously evaluated on the Edge-IIoTset dataset, showing strong performance in detecting a broad spectrum of attacks, including rare classes, while maintaining scalability, interpretability, and suitability for real-world IIoT deployment.
Proposed methodology
The methodology of this study involves three components: (1) a proposed multi-stage CNN method, (2) a bidirectional GRU, and (3) triple attention functions for feature extraction, enabling accurate classification of both frequent and rare cyberattacks. Figure 1 illustrates the working principle of this research. Our proposed model performs multiclass intrusion classification and also achieves outstanding binary classification. The ASTRID-Net model is an Adaptive Spatiotemporal Residual-Interpretable Detection Network with three attention mechanisms for capturing the most important features for classification.
Fig. 1.

Methodology of attention based ASTRID-Net model for Intrusion Detection.
As Figure 2 shows, we create a multi-stage CNN model whose parallel branches are concatenated with each other. Our proposed model also adds a recurrent neural network architecture, the bidirectional GRU.
Fig. 2.
Methodology of multi stages CNN model for intrusion detection.
Then, we apply three attention functions while the model extracts the most important features. Finally, we add a GlobalMaxPooling1D layer, and an output layer is used for both binary and multiclass classification.
Dataset overview and preprocessing
The Edge-IIoTset dataset28 is used in this research. It is a comprehensive and realistic intrusion detection benchmark specifically designed for IoT and IIoT security research. It comprises approximately 2,219,201 network flow records, covering 15 attack types along with normal traffic, distributed across diverse communication protocols such as TCP, UDP, MQTT, HTTP, Modbus TCP, and ICMP. Each sample contains 63 features describing various packet-level and protocol-level behaviors.
The preprocessing involves the following steps:
i. Handling Missing Values: The dataset has two parts, features and labels. For numerical features, missing values were imputed with the median of the respective feature; for categorical features, missing entries were substituted with the mode of the corresponding feature.

ii. Removal of Non-Predictive or Redundant Features: Not every feature is essential for predicting the attack class. Features that do not contribute to the predictive task were removed to prevent potential bias or information leakage during model training.

iii. Feature Normalization: StandardScaler is used to transform the numerical features to zero mean and unit variance.

iv. Encoding Categorical Features: Label Encoding assigns a unique integer to each distinct category, transforming the categorical variables into integer representations.
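As a hedged sketch of the four steps above (the column names are illustrative stand-ins, not the actual Edge-IIoTset feature names), the preprocessing pipeline could be assembled as:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler, LabelEncoder

def preprocess(df: pd.DataFrame, label_col: str):
    """Median/mode imputation, standardization, and label encoding."""
    df = df.copy()
    num_cols = df.drop(columns=[label_col]).select_dtypes("number").columns
    cat_cols = df.drop(columns=[label_col]).select_dtypes(exclude="number").columns

    # Step i: median for numerical features, mode for categorical features
    df[num_cols] = df[num_cols].fillna(df[num_cols].median())
    for c in cat_cols:
        df[c] = df[c].fillna(df[c].mode().iloc[0])

    # Step iii: zero mean / unit variance for numerical features
    df[num_cols] = StandardScaler().fit_transform(df[num_cols])

    # Step iv: label-encode categorical features and the target
    for c in cat_cols:
        df[c] = LabelEncoder().fit_transform(df[c])
    y = LabelEncoder().fit_transform(df.pop(label_col))
    return df, y
```

Step ii (dropping non-predictive columns) is dataset-specific and would precede this function.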
ASTRID-Net architecture
The proposed ASTRID-Net is designed to efficiently extract spatial, sequential, and attention-based representations from IoT/IIoT network traffic data.
In this research, the multi-scale CNN block contributes the most because it performs the primary and most effective feature extraction from raw traffic data. We add a BiGRU layer to model temporal dependencies in the flow sequences. Furthermore, the three attention mechanisms enhance performance by refining and highlighting the most informative features. Algorithm 1 shows the ASTRID-Net model.
Algorithm 1.
ASTRID-Net for intrusion detection.
The network consists of five main modules: Input layer, Residual Multi-Scale Convolution Module, Attention Mechanism, Squeeze-and-Excitation Block, Classification Module. The description of each component is given below:
Input layer
In our proposed ASTRID-Net, the input layer is designed to handle a one-dimensional feature vector. Initially, the input is defined as

$\mathbf{x} \in \mathbb{R}^{d}$,

where $d$ is the input dimension. To make the input compatible with the subsequent convolutional and recurrent operations, our model reshapes it into a two-dimensional sequence, where the first dimension corresponds to the sequence length and the second dimension corresponds to a single feature. After reshaping for the CNN/RNN layers, we obtain

$\mathbf{x}' \in \mathbb{R}^{d \times 1}$.

Here $d$ is the sequence length and $1$ is the feature dimension. The reshape operation treats each scalar as a time step with one feature channel.
Residual multi-scale convolutional block
In this block, we aim to capture patterns at different temporal scales. The multi-scale convolutions use both small and large kernels to capture local and global features. The branch outputs are concatenated to increase feature richness, because each kernel size contributes complementary information. Moreover, a residual connection with a projected input preserves low-level features.

The input to the block is

$\mathbf{X} \in \mathbb{R}^{B \times T \times C}$

Multi-scale convolution:

$\mathbf{H}_k = \mathrm{ReLU}(\mathbf{X} \ast \mathbf{W}_k + \mathbf{b}_k), \quad k \in \{3, 5, 7\}$

Here, $B$ is the batch size, $T$ represents the sequence length, and $C$ is the input feature dimension. For each kernel size $k$, a 1D convolution is applied with $F$ filters (here $F = 64$), where $\mathbf{W}_k$ are convolution filters and $\mathbf{b}_k$ is the bias. The branch outputs are concatenated along the channel axis and combined with a projection of the input:

$\mathbf{Y} = [\mathbf{H}_3 \,\|\, \mathbf{H}_5 \,\|\, \mathbf{H}_7] + \mathrm{Proj}(\mathbf{X})$
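The block above can be sketched in Keras (the framework used in this work); the helper name and the choice of a 1×1 convolution for the residual projection are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_multiscale_block(x, filters=64, kernel_sizes=(3, 5, 7)):
    # H_k = ReLU(X * W_k + b_k) for each kernel size k
    branches = [layers.Conv1D(filters, k, padding="same", activation="relu")(x)
                for k in kernel_sizes]
    h = layers.Concatenate()(branches)                    # (B, T, 3F)
    # Project the input to 3F channels so shapes match, then add the residual
    proj = layers.Conv1D(filters * len(kernel_sizes), 1, padding="same")(x)
    return layers.Add()([h, proj])

inp = layers.Input(shape=(63, 1))
out = residual_multiscale_block(inp)
model = tf.keras.Model(inp, out)
```

With 63 time steps and one input channel, the block emits 192 feature channels (3 branches × 64 filters), matching the concatenated width in Table 3.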
Attention mechanism
In this study, we propose a triple attention mechanism that focuses on the most relevant features and time steps.
Temporal attention
In Table 2, we illustrate a temporal attention block for extracting the most important time steps in the sequence which are required for intrusion detection. It uses a Dense layer with 1 unit (to produce a scalar score for each time step) followed by a tanh activation for initial transformation. Then we have used a flatten layer and passed through a softmax activation to normalize them so that the sum across all time steps is 1. The resulting attention weights are then repeated across the feature dimension (RepeatVector) and permuted to align with the original input shape before being multiplied with the sequence, effectively scaling each time step’s features by its importance. Figure 3 shows the attention layers for different important features extraction.
$e_t = \tanh(\mathbf{w}^{\top}\mathbf{x}_t + b)$

$\alpha_t = \dfrac{\exp(e_t)}{\sum_{j=1}^{T}\exp(e_j)}, \qquad \tilde{\mathbf{x}}_t = \alpha_t \, \mathbf{x}_t$
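A minimal NumPy sketch of the temporal-attention computation described above, with `w` and `b` as random stand-ins for the Dense(1) layer's learned parameters:

```python
import numpy as np

def temporal_attention(X, w, b):
    """X: (T, C) sequence; w: (C,), b: scalar -- the Dense(1) parameters."""
    e = np.tanh(X @ w + b)          # scalar score e_t per time step
    a = np.exp(e - e.max())
    a = a / a.sum()                 # softmax: attention weights sum to 1
    return a[:, None] * X, a        # each time step scaled by its weight

rng = np.random.default_rng(0)
X = rng.normal(size=(63, 128))      # 63 time steps, 128 BiGRU features
out, alpha = temporal_attention(X, rng.normal(size=128), 0.0)
```

The RepeatVector/Permute steps in the Keras version only broadcast the weights across the feature dimension, which NumPy handles implicitly.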
Table 2.
Three attention blocks with different parameters.
| Attention type | Layer / operation | Parameters |
|---|---|---|
| Channel attention | GlobalAveragePooling1D | None |
| Dense (1st FC) | units = channels // ratio, activation=’relu’ | |
| Dense (2nd FC) | units = channels, activation=’sigmoid’ | |
| Reshape | target_shape = (1, channels) | |
| Multiply | None | |
| Spatial attention | Lambda (avg_pool) | axis=-1, keepdims = True |
| Lambda (max_pool) | axis=-1, keepdims = True | |
| Concatenate | axis=-1 | |
| Conv1D | filters = 1, kernel_size = 7, padding=’same’, activation=’sigmoid’ | |
| Multiply | None | |
| Temporal attention | Dense | units = 1, activation=’tanh’ |
| Flatten | None | |
| Activation | Softmax | |
| RepeatVector | n = input_seq.shape[-1] | |
| Permute | dims = [2, 1] | |
| Multiply | None |
Fig. 3.
Design the five attention layers with five neurons.
Channel attention
Here, channel attention focuses on identifying the most useful information.
In Fig. 4, we apply Global Average Pooling over the time dimension to “squeeze” temporal information into a single vector per channel. Then, we have used two Dense layers. The first dense reduces the channel dimensionality by a factor of ratio = 16 (reduction ratio) with a ReLU activation, and the second restores the original channel dimension with a sigmoid activation, producing scaling factors between 0 and 1. The Reshape step ensures compatibility with the input shape before the result is multiplied with the original feature maps to “excite” important channels.
Fig. 4.
A deep learning attention based framework for intrusion detection in IoT and IIoT Networks.
$z_c = \dfrac{1}{T}\sum_{t=1}^{T} x_{t,c}$

$\mathbf{s} = \sigma\big(\mathbf{W}_2 \, \mathrm{ReLU}(\mathbf{W}_1 \mathbf{z})\big), \qquad \tilde{x}_{t,c} = s_c \, x_{t,c}$
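The squeeze-and-excitation computation above can be sketched in NumPy as follows; the weight matrices are random stand-ins for the two Dense layers' learned parameters:

```python
import numpy as np

def se_channel_attention(X, W1, W2):
    """X: (T, C); W1: (C, C//r) reduction FC; W2: (C//r, C) restoring FC."""
    z = X.mean(axis=0)                                 # squeeze: GAP over time, (C,)
    h = np.maximum(z @ W1, 0.0)                        # first FC with ReLU
    s = 1.0 / (1.0 + np.exp(-(h @ W2)))                # second FC with sigmoid
    return X * s                                       # scale each channel by s_c

rng = np.random.default_rng(1)
T, C, r = 63, 128, 16                                  # r = reduction ratio
X = rng.normal(size=(T, C))
out = se_channel_attention(X,
                           rng.normal(size=(C, C // r)),
                           rng.normal(size=(C // r, C)))
```

Because the sigmoid gates lie in (0, 1), each channel's magnitude can only be preserved or attenuated, which is the intended "excite important, suppress irrelevant" behavior.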
Spatial attention block
Spatial attention determines which positions (time steps) within the sequence should be emphasized. It computes both average pooling and max pooling along the channel axis, producing two 1D maps of size equal to the time dimension. These are concatenated along the channel dimension and passed through a Conv1D layer with 1 filter, kernel size = 7, and sigmoid activation to produce a spatial attention map. In Fig. 5, we add conv1D layer for highlighting the spatial features.
Fig. 5.

Pipelining representations for spatial attention block.
$\mathbf{M} = \sigma\Big(\mathrm{Conv1D}_{k=7}\big([\mathrm{AvgPool}_{ch}(\mathbf{X}) \,\|\, \mathrm{MaxPool}_{ch}(\mathbf{X})]\big)\Big)$

$\tilde{\mathbf{X}} = \mathbf{M} \odot \mathbf{X}$
Therefore, we use the spatial attention mechanism to highlight informative regions by exploiting the spatial relationships among feature maps.
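The spatial attention map above can be sketched in NumPy; `w` is a random stand-in for the single learned Conv1D filter applied over the concatenated pooled maps:

```python
import numpy as np

def spatial_attention(X, w, b=0.0, k=7):
    """X: (T, C). w: (k, 2) filter over the stacked [avg_pool, max_pool] maps."""
    pooled = np.stack([X.mean(axis=1), X.max(axis=1)], axis=1)   # (T, 2)
    pad = k // 2
    padded = np.pad(pooled, ((pad, pad), (0, 0)))                # 'same' padding
    scores = np.array([(padded[t:t + k] * w).sum()
                       for t in range(X.shape[0])]) + b
    m = 1.0 / (1.0 + np.exp(-scores))                            # sigmoid map, (T,)
    return X * m[:, None], m

rng = np.random.default_rng(2)
X = rng.normal(size=(63, 128))
out, m = spatial_attention(X, rng.normal(size=(7, 2)))
```

Each position in the sequence receives a single gate in (0, 1), so the mechanism emphasizes or suppresses whole time steps rather than individual channels.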
Squeeze operation: global context embedding
In this stage, our model aggregates global information along the temporal dimension by applying a Global Average Pooling (GAP) operation. The squeeze operation captures the overall statistical distribution of each feature channel, effectively reducing the feature map into a compact global descriptor that summarizes the entire sequence. Given an input feature map $\mathbf{X} \in \mathbb{R}^{B \times T \times C}$, where $B$ is the batch size, $T$ is the sequence length, and $C$ is the number of channels, the squeeze operation computes the global average pooling vector

$z_c = \dfrac{1}{T}\sum_{t=1}^{T} x_{t,c}, \quad c = 1, \dots, C.$
Excitation operation: channel recalibration
The excitation mechanism then learns non-linear interactions among channels. ReLU and sigmoid activation functions assign importance weights to each channel. The recalibrated weights are then applied back to the original feature maps, allowing the network to emphasize informative channels while suppressing less relevant ones.
Model definition
We design the model with an input layer accepting 63 features, which are reshaped into a three-dimensional tensor suitable for convolutional processing. We apply three kernels of different sizes to capture diverse local patterns, each producing 64 feature maps. A concatenation layer merges the outputs, forming a rich multi-scale representation of 192 feature channels. According to Table 3, we use a batch normalization layer to stabilize training and a dropout layer (rate = 0.3) to reduce overfitting. Next, a bidirectional GRU with 64 units in each direction processes sequential dependencies, outputting 128 features per timestep. Channel attention (SE-style) then highlights the most relevant feature channels. Additionally, a GlobalMaxPooling1D layer converts the sequence information into a compact 128-dimensional vector. At this stage, a dense layer refines the features with 128 units and ReLU activation. Finally, our proposed model predicts the target class through a dense softmax layer with 15 units.
Table 3.
Our proposed ASTRID-Net Architectures.
| Layer / Block | Output Shape | Parameters / Notes |
|---|---|---|
| Input | (None, 63) | Input layer for 63 features |
| Reshape | (None, 63, 1) | Reshapes 1D input to 3D for Conv1D |
| Conv1D (kernel = 3) | (None, 63, 64) | 64 filters, kernel size = 3, activation=’relu’, padding=’same’ |
| Conv1D (kernel = 5) | (None, 63, 64) | 64 filters, kernel size = 5, activation=’relu’, padding=’same’ |
| Conv1D (kernel = 7) | (None, 63, 64) | 64 filters, kernel size = 7, activation=’relu’, padding=’same’ |
| Concatenate (Multi-scale CNNs) | (None, 63, 192) | Merges 3 Conv1D paths (64*3 = 192) |
| BatchNormalization | (None, 63, 192) | Normalizes features |
| Dropout | (None, 63, 192) | Dropout rate = 0.3 |
| Bidirectional GRU | (None, 63, 128) | GRU units = 64 forward + 64 backward, return_sequences = True |
| Temporal Attention | (None, 63, 128) | Softmax over timesteps to weigh temporal features |
| Channel Attention | (None, 63, 128) | SE-style channel attention with reduction_ratio = 8 |
| GlobalMaxPooling1D | (None, 128) | Pools across timesteps |
| Dense(Fully Connected) | (None, 128) | Units = 128, activation=’relu’ |
| Dropout | (None, 128) | Dropout rate = 0.4 |
| Output Dense | (None, 15) | Units = num_classes = 15, activation=’softmax’ |
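A hedged end-to-end sketch of the Table 3 stack in Keras: the attention modules are simplified inline versions of the blocks described earlier, and any hyperparameter not listed in Table 3 is an assumption rather than the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_astrid_net(n_features=63, n_classes=15):
    inp = layers.Input(shape=(n_features,))
    x = layers.Reshape((n_features, 1))(inp)
    # Multi-scale CNNs: three parallel Conv1D paths, concatenated to 192 channels
    convs = [layers.Conv1D(64, k, padding="same", activation="relu")(x)
             for k in (3, 5, 7)]
    x = layers.Concatenate()(convs)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    # Bidirectional GRU: 64 units each direction -> 128 features per timestep
    x = layers.Bidirectional(layers.GRU(64, return_sequences=True))(x)
    # Temporal attention: softmax over timesteps
    a = layers.Dense(1, activation="tanh")(x)
    a = layers.Softmax(axis=1)(a)
    x = layers.Multiply()([x, a])
    # Channel attention (SE-style) with reduction_ratio = 8
    s = layers.GlobalAveragePooling1D()(x)
    s = layers.Dense(128 // 8, activation="relu")(s)
    s = layers.Dense(128, activation="sigmoid")(s)
    s = layers.Reshape((1, 128))(s)
    x = layers.Multiply()([x, s])
    x = layers.GlobalMaxPooling1D()(x)
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.4)(x)
    out = layers.Dense(n_classes, activation="softmax")(x)
    return tf.keras.Model(inp, out)

model = build_astrid_net()
```

Swapping `n_classes=15` for 2 (or 1 with a sigmoid) would give the binary variant evaluated later in the paper.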
Model training strategy
The training strategy of the proposed ASTRID-Net model is carefully designed to ensure robust convergence, mitigate overfitting, and maximize generalization performance on the dataset. To ensure robustness, we propose an adaptive learning strategy with different hyperparameters and select callback functions to optimize performance and prevent overfitting. The Adam optimizer provides efficient gradient-based updates with an initial learning rate of 0.001, which helps the loss converge smoothly. Sparse Categorical Crossentropy is used for multi-class classification with integer-encoded labels. We use a large batch size to stabilize gradient updates and train for a maximum of 10 epochs, with early stopping monitoring the validation loss (patience = 5) and restoring the best weights.
The training and compilation steps of the ASTRID-Net model follow this configuration. The dataset is divided into training and validation sets with an 80%/20% ratio.
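The 80%/20% split can be sketched with scikit-learn's `train_test_split`; the `stratify` option, which preserves per-class ratios for rare attacks, is our assumption rather than something stated in the text:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 63))          # stand-in for the 63 Edge-IIoTset features
y = rng.integers(0, 15, size=1000)       # stand-in for the 15 class labels

# 80% training / 20% validation, preserving class proportions
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
```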
Training configuration
In the training process of our proposed model, three different callback strategies are used to prevent overfitting. EarlyStopping measures the validation loss $\mathcal{L}_{val}(t)$ at each epoch $t$ and halts training when no improvement is observed after 7 consecutive epochs. In our previous research36, we had applied early stopping and model checkpointing. Here, we additionally apply ReduceLROnPlateau to adjust the learning rate when progress slows. Let the learning rate at epoch $t$ be $\eta_t$; then

$\eta_{t+1} = \begin{cases} \eta_t, & \text{if } \mathcal{L}_{val} \text{ improves within 3 epochs} \\ 0.5\,\eta_t, & \text{otherwise.} \end{cases}$

Thus, ReduceLROnPlateau shrinks the learning rate by a factor of 0.5 whenever the validation loss plateaus.
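The learning-rate rule above can be sketched as a plain-Python schedule; Keras's `ReduceLROnPlateau` callback with `factor=0.5` and `patience=3` implements the same policy internally:

```python
def schedule_lr(val_losses, lr0=1e-3, patience=3, factor=0.5):
    """Keep eta_t while val loss improves; halve it after `patience`
    epochs without improvement (a sketch of ReduceLROnPlateau)."""
    lr, best, wait = lr0, float("inf"), 0
    lrs = []
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0        # improvement: reset the patience counter
        else:
            wait += 1
            if wait >= patience:        # plateau: eta_{t+1} = 0.5 * eta_t
                lr, wait = lr * factor, 0
        lrs.append(lr)
    return lrs

# Steadily improving loss keeps the rate; a 3-epoch plateau halves it
lrs = schedule_lr([1.00, 0.90, 0.90, 0.90, 0.90])
```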
Evaluation metrics
We evaluate the performance of the proposed ASTRID-Net model. Initially, we compute the confusion matrix, which is widely applied for identifying correctly classified and misclassified instances in intrusion detection tasks. The classification report provides key indicators of the model's effectiveness in correctly identifying both normal and attack instances, particularly in the presence of class imbalance. The confusion matrix further complements this analysis by detailing the true positives, true negatives, false positives, and false negatives, enabling a granular understanding of misclassification patterns. Finally, we analyze the loss function; our model uses a customized training strategy with different callback functions.
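A hedged toy example of this evaluation using scikit-learn; the labels here are synthetic, not drawn from Edge-IIoTset:

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Synthetic ground truth and predictions for a 3-class toy problem
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2, 2])

# Rows = true classes, columns = predicted classes;
# the diagonal counts the correctly classified samples
cm = confusion_matrix(y_true, y_pred)

# Per-class precision, recall, and F1, plus macro/weighted averages
report = classification_report(y_true, y_pred, digits=4)
```

On the real task, `y_true`/`y_pred` would be the 15-class labels and the argmax of the softmax outputs.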
Results and analysis
Our proposed model was built on Google Colab using the TensorFlow/Keras deep learning framework. The proposed ASTRID-Net model integrates residual multi-scale convolution blocks, Bidirectional GRUs, and multiple attention mechanisms, including temporal, channel (squeeze-and-excitation), and spatial attention. First, multi-scale convolutional layers extract spatial features, which a bidirectional GRU then processes in both forward and backward directions to capture temporal dependencies. Temporal attention focuses on the most important time steps, while the squeeze-and-excitation block assigns channel-wise importance to highlight informative features. Moreover, the spatial attention block focuses on significant positions within the sequence. We apply a GlobalMaxPooling1D layer to condense the feature maps, followed by a fully connected layer with ReLU activation and Dropout for regularization. Finally, a softmax output layer provides class probabilities, enabling accurate sequence classification.
Initially, we focus on evaluation metrics such as precision, recall, and F1-score. Table 4 shows our proposed model's performance across different types of attack. These results demonstrate that ASTRID-Net achieves excellent performance across 15 different classes. Figure 6 illustrates that Backdoor, DDoS_ICMP, SQL_injection, Password, Vulnerability_scanner, and Ransomware also achieve very high metric values, reflecting the model's robustness in recognizing their presence. We observe slightly lower performance for the Ransomware class, with precision (0.9757), recall (0.9927), and F1-score (0.9841), suggesting a small number of misclassifications. Overall, our model achieves exceptionally good multiclass intrusion detection performance, with an overall accuracy of 99.97% and macro-averaged precision, recall, and F1-score all above 99.7%.
Table 4.
Classification report for multiclass classification.
| Class | Precision | Recall | F1-score |
|---|---|---|---|
| Backdoor | 0.9992 | 0.9895 | 0.9943 |
| DDoS_HTTP | 1.0000 | 1.0000 | 1.0000 |
| DDoS_ICMP | 1.0000 | 0.9999 | 1.0000 |
| DDoS_TCP | 1.0000 | 1.0000 | 1.0000 |
| DDoS_UDP | 1.0000 | 1.0000 | 1.0000 |
| Fingerprinting | 1.0000 | 1.0000 | 1.0000 |
| MITM | 1.0000 | 1.0000 | 1.0000 |
| Normal | 1.0000 | 1.0000 | 1.0000 |
| Password | 1.0000 | 0.9999 | 1.0000 |
| Port_Scanning | 0.9850 | 0.9876 | 0.9863 |
| Ransomware | 0.9757 | 0.9927 | 0.9841 |
| SQL_injection | 0.9999 | 1.0000 | 1.0000 |
| Uploading | 1.0000 | 1.0000 | 1.0000 |
| Vulnerability_scanner | 1.0000 | 0.9989 | 0.9995 |
| XSS | 0.9966 | 1.0000 | 0.9983 |
| Accuracy | 0.9997 | ||
| Macro Average | 0.9971 | 0.9979 | 0.9975 |
| Weighted Average | 0.9997 | 0.9997 | 0.9997 |
Fig. 6.

Performance metrics for all the attacks using our proposed method.
Figure 7 illustrates the confusion matrix for multiclass classification using the proposed ASTRID-Net model. The results show near-perfect classification across all classes, including highly imbalanced ones like MITM, Fingerprinting, and XSS, demonstrating the model’s strong generalization and precise discrimination of complex attack patterns in IIoT environments.
Fig. 7.

Confusion matrix for multiclass classification.
Figure 8 shows that the training and validation accuracy of our model was outstanding from the initial stages. The model demonstrates a rapid increase in both training and validation accuracy within the first few epochs, stabilizing above 99.97% accuracy. This indicates that the model is learning effectively and generalizing well.
Fig. 8.

Accuracy and Loss curve against number of epochs for multi class intrusion detection.
At the beginning of the training process, the validation accuracy remains closely aligned with the training accuracy; we observe a minor fluctuation, but it does not indicate an overfitting issue. The loss plot (right) shows a steep decline in both training and validation loss within the first two epochs, followed by a gradual convergence to near-zero values. After convergence, the loss remains nearly constant, with no spikes in validation loss across the remaining epochs. Overall, the ASTRID-Net model achieves fast convergence and maintains strong generalization performance throughout the training process.
Table 5 shows perfect performance of the proposed model in binary classification, achieving 100% precision, recall, and F1-score for both attack and normal classes, confirming its exceptional accuracy in anomaly detection.
Table 5.
Classification report for binary (anomaly) class classification.
| Class / Metric | Precision | Recall | F1-Score |
|---|---|---|---|
| Attack | 1.0000 | 1.0000 | 1.0000 |
| Benign | 1.0000 | 1.0000 | 1.0000 |
| Accuracy | 1.0000 | ||
| Macro Average | 1.0000 | 1.0000 | 1.0000 |
| Weighted Average | 1.0000 | 1.0000 | 1.0000 |
Figure 9 displays the confusion matrix for binary classification, where the ASTRID-Net model perfectly distinguishes between normal and attack traffic with zero misclassifications, confirming its exceptional reliability for anomaly detection.
Fig. 9.
Confusion matrix for binary (anomaly) class classification.
According to Fig. 10, our proposed model shows almost zero validation loss throughout the training process. The training history plots illustrate the performance of the binary classification model over 7 epochs: both training and validation accuracy rapidly reach 100%, and the validation accuracy remains flat at 1.0 across all epochs. Evaluation on an unseen test set would provide a more detailed analysis to confirm the model's generalization capability.
Fig. 10.

Accuracy and loss curves against the number of epochs for binary class intrusion detection.
Table 6 presents a comparative analysis of machine learning and deep learning approaches applied to various intrusion detection datasets. The RNN model on the NSL-KDD dataset [29] achieved 92.18% accuracy with 90.23% precision and a 90.29% F1-score. On the IoTID20 dataset, the RF model [30] reached a very high accuracy of 98.68%, while the DCNN in [31] obtained 98.12% accuracy with 97.13% precision, 97.80% recall, and a 97.46% F1-score. Our proposed model achieves the highest accuracy among these approaches.
Table 6.
Comparative analysis with state-of-the-art approaches.
| References | Model | Dataset | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|---|---|
| 29 | RNN | NSL-KDD | 0.9218 | 0.9023 | – | 0.9029 |
| 30 | RF | IoTID20 | 0.9868 | – | – | – |
| 31 | DCNN | IoTID20 | 0.9812 | 0.9713 | 0.978 | 0.9746 |
| 32 | FCFFN | IoT | 0.9374 | 0.9371 | 0.938 | 0.9347 |
| 33 | Ensemble | IoT-23 | 0.996 | – | – | – |
| 34 | DT XGBOOST | UNSW-NB15 | 0.9085 | – | – | – |
| 35 | DAE-DNN | CSE-CIC-ID2018 | 0.9579 | 0.9538 | 0.958 | 0.9511 |
| 36 | Flow Transformer | SJTU-AN21 | 0.86 | 0.868 | – | 0.855 |
| 37 | CNN-BiLSTM | N-BaIoT | 0.9952 | – | – | – |
| 39 | CST-AFNet | Edge-IIoTset | 0.9997 | 0.9997 | 0.9997 | 0.9997 |
| Proposed | ASTRID-Net | Edge-IIoTset | 0.9997 | 0.9971 | 0.9979 | 0.9975 |
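The macro-averaged figures reported in Table 6 weight every class equally, while weighted averages scale each class by its support; on imbalanced IoT traffic the two can differ noticeably. A minimal sketch, using hypothetical per-class precisions and supports (not the paper's actual class counts):

```python
# Hypothetical per-class precisions for a 3-class problem, with class
# supports (sample counts) for the weighted average -- illustrative only.
precisions = [0.999, 0.998, 0.994]
supports   = [8000, 1500, 500]

# Macro average: unweighted mean over classes (minority classes count fully).
macro = sum(precisions) / len(precisions)

# Weighted average: each class scaled by its share of the samples.
weighted = sum(p * s for p, s in zip(precisions, supports)) / sum(supports)

print(round(macro, 4), round(weighted, 4))  # → 0.997 0.9986
```

Because the macro average gives minority classes equal influence, strong macro scores like those of ASTRID-Net indicate that rare attack classes are detected nearly as well as the dominant ones.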
In this research, we also apply a DNN model to the Edge-IIoTset dataset; it achieves only 72.80% accuracy and misclassifies samples in several classes. We further compare against our previously published model CST-AFNet [39], which was likewise applied to the Edge-IIoTset dataset, and evaluate its performance against the proposed ASTRID-Net model.
Conclusion and future work
This research presents a customized ASTRID-Net model that incorporates three attention mechanisms for intrusion detection in IoT and IIoT environments. The model's performance is evaluated on the Edge-IIoTset dataset for both binary and multiclass classification tasks. It combines multi-scale CNNs, BiGRU layers, and temporal–channel attention modules to capture rich spatio-temporal patterns from network traffic. The proposed model delivers extraordinary experimental results, achieving 100% accuracy in binary classification and 99.97% accuracy in multiclass classification, with particularly strong results on minority classes. It also exhibits near-zero validation loss throughout training, further confirming its stability. These outcomes demonstrate the robustness and effectiveness of ASTRID-Net in detecting a wide range of complex cyberattacks, making it a promising solution for real-world IIoT security applications. In future work, we will explore federated and edge computing settings to enhance scalability and data privacy; moreover, incorporating online learning and explainable AI (XAI) techniques is expected to improve adaptability against evolving threats and provide greater interpretability for security analysts.
Author contributions
A.Z. (Ashrafun Zannat) designed the study, curated the dataset, implemented the model, and carried out the experimental analysis. M.S.A. (Md Shakil Ahmmed) contributed to methodology refinement, validation, visualization, and assisted with the preparation of figures and tables. M.A.H. (Md. Alamgir Hossain) supervised the project, provided guidance throughout the research process, and critically revised the manuscript. A.Z. wrote the original draft of the manuscript, and M.S.A. and M.A.H. reviewed and edited it. A.S.M. (Alifa Shanzidah Manarat) provided guidance throughout the research process and critically revised the manuscript. M.S.I. (Md. Saiful Islam) supervised the project and provided guidance throughout the research process. All authors discussed the results, contributed to the interpretation of findings, and approved the final version of the manuscript.
Data availability
The dataset analyzed during this study is publicly available in IEEE DataPort under the title "Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications: Centralized and Federated Learning", 10.21227/MBC1-1H68.
Code availability
The full implementation of the proposed framework, including both binary‑ and multiclass classification notebooks, is publicly available on GitHub at: https://github.com/ashrafunzannat/ASTRID-Net.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Bakhsh, S. A. et al. Enhancing IoT network security through deep learning-powered intrusion detection system. Internet Things 24, 100936. 10.1016/j.iot.2023.100936 (2023).
- 2. Orman, A. Cyberattack detection systems in industrial internet of things (IIoT) networks in big data environments. Appl. Sci. 15 (6), 3121. 10.3390/app15063121 (2025).
- 3. Md, A. et al. Deep learning and ensemble methods for anomaly detection in ICS security. Int. J. Inf. Technol. 10.1007/s41870-024-02299-7 (2024).
- 4. Salayma, M. Risk and threat mitigation techniques in internet of things (IoT) environments: a survey. Front. Internet Things 2, 1306018. 10.3389/friot.2023.1306018 (2024).
- 5. Md, A., Hossain & Islam, M. S. Ensuring network security with a robust intrusion detection system using ensemble-based machine learning. Array 19, 100306. 10.1016/j.array.2023.100306 (2023).
- 6. Zhukabayeva, T., Zholshiyeva, L., Karabayev, N., Khan, S. & Alnazzawi, N. Cybersecurity solutions for industrial internet of things–edge computing integration: challenges, threats, and future directions. Sensors 25 (1), 213. 10.3390/s25010213 (2025).
- 7. Wakili, A. & Bakkali, S. Privacy-preserving security of IoT networks: a comparative analysis of methods and applications. Cyber Secur. Appl. 3, 100084. 10.1016/j.csa.2025.100084 (2025).
- 8. Saif, S., Hossain, M. A. & Islam, M. S. IoT security fortification: enhancing cyber threat detection through feature selection and advanced machine learning. In 2024 1st International Conference on Innovative Engineering Sciences and Technological Research (ICIESTR). 10.1109/ICIESTR60916.2024.10798181 (2024).
- 9. Md, A., Hossain, S., Saif & Islam, M. S. Interpretable machine learning for IoT security: feature selection and explainability in botnet intrusion detection using extra trees classifier. In 2024 1st International Conference on Innovative Engineering Sciences and Technological Research (ICIESTR). 10.1109/ICIESTR60916.2024.10798158 (2024).
- 10. Hossain, M. S., Hossain, M. A. & Islam, M. S. I-MPaFS: enhancing EDoS attack detection in cloud computing through a data-driven approach. J. Cloud Comput. 13 (1), 151. 10.1186/s13677-024-00699-5 (2024).
- 11. Ahanger, T. A., Ullah, I., Algamdi, S. A. & Tariq, U. Machine learning-inspired intrusion detection system for IoT: security issues and future challenges. Comput. Electr. Eng. 123, 110265. 10.1016/j.compeleceng.2025.110265 (2025).
- 12. Liu, J. et al. FedCD: a hybrid federated learning framework for efficient training with IoT devices. IEEE Internet Things J. 11 (11), 20040–20050. 10.1109/JIOT.2024.3368216 (2024).
- 13. Alsaleh, S., Menai, M. E. B. & Al-Ahmadi, S. A heterogeneity-aware semi-decentralized model for a lightweight intrusion detection system for IoT networks based on federated learning and BiLSTM. Sensors 25 (4), 1039. 10.3390/s25041039 (2025).
- 14. Srinivasan, M. & Senthilkumar, N. C. Intrusion detection and prevention system (IDPS) model for IIoT environments using hybridized framework. IEEE Access 13, 26608–26621. 10.1109/ACCESS.2025.3538461 (2025).
- 15. Popoola, S. I. et al. Multi-stage deep learning for intrusion detection in industrial internet of things. IEEE Access 13, 60532–60555. 10.1109/ACCESS.2025.3557959 (2025).
- 16. Saidane, S., Telch, F., Shahin, K. & Granelli, F. Deep GraphSAGE enhancements for intrusion detection: analyzing attention mechanisms and GCN integration. J. Inf. Secur. Appl. 90, 104013. 10.1016/j.jisa.2025.104013 (2025).
- 17. Mukisa, K. J., Chijioke Ahakonye, L. A., Kim, D. S. & Lee, J. M. Enhancing IIoT security using hybrid CNN-BiLSTM models with blockchain integration. In 2025 International Conference on Artificial Intelligence in Information and Communication (ICAIIC) 0459–0464 (IEEE, 2025). 10.1109/ICAIIC64266.2025.10920640
- 18. Asif, S. OSEN-IoT: an optimized stack ensemble network with genetic algorithm for robust intrusion detection in heterogeneous IoT networks. Expert Syst. Appl. 276, 127183. 10.1016/j.eswa.2025.127183 (2025).
- 19. Zhai, J. et al. Industrial IoT intrusion attack detection based on composite attention-driven multi-layer pyramid features. Comput. Netw. 263, 111207. 10.1016/j.comnet.2025.111207 (2025).
- 20. Zhou, H. et al. CBCTL-IDS: a transfer learning-based intrusion detection system optimized with the black kite algorithm for IoT-enabled smart agriculture. IEEE Access 13, 46601–46615. 10.1109/ACCESS.2025.3550800 (2025).
- 21. Gayathri, K., Tulasi Kumar, K., Sai Vignesh, L., Ajay Prathap, P. & Anusha, G. Leveraging smart sentry to detect and mitigate cyber threats in industrial IoT networks. 10.5281/ZENODO.15163499 (2025).
- 22. Lu, K. D. et al. Multi-objective discrete extremal optimization of variable-length blocks-based CNN by joint NAS and HPO for intrusion detection in IIoT. IEEE Trans. Dependable Secure Comput. 10.1109/TDSC.2025.3545363
- 23. Nandanwar, H. & Katarya, R. Securing industry 5.0: an explainable deep learning model for intrusion detection in cyber-physical systems. Comput. Electr. Eng. 123, 110161. 10.1016/j.compeleceng.2025.110161 (2025).
- 24. Zhukabayeva, T., Ahmad, Z., Karabayev, N., Baumuratova, D. & Ali, M. An intrusion detection system for multiclass classification across multiple datasets in industrial IoT using machine learning and neural networks integrated with edge computing. In Advances in Transdisciplinary Engineering (eds Nayyar, A., Ling, T. W. & Leung, C.) (IOS, 2025). 10.3233/ATDE250012
- 25. Ismail, S., Dandan, S. & Qushou, A. Intrusion detection in IoT and IIoT: comparing lightweight machine learning techniques using TON_IoT, WUSTL-IIOT-2021, and Edge-IIoTset datasets. IEEE Access. 10.1109/ACCESS.2025.3554083 (2025).
- 26. Bhat, S. et al. An integrated approach to network security: combining cryptography and intrusion detection. In 2025 International Conference on Pervasive Computational Technologies (ICPCT) 223–228 (IEEE, 2025). 10.1109/ICPCT64145.2025.10939222
- 27. Qureshi, S. S. et al. Advanced AI-driven intrusion detection for securing cloud-based industrial IoT. Egypt. Inf. J. 30, 100644. 10.1016/j.eij.2025.100644 (2025).
- 28. Ferrag, M. A., Friha, O., Hamouda, D., Maglaras, L. & Janicke, H. Edge-IIoTset: a new comprehensive realistic cyber security dataset of IoT and IIoT applications: centralized and federated learning. IEEE DataPort. 10.21227/MBC1-1H68
- 29. Almiani, M., AbuGhazleh, A., Al-Rahayfeh, A., Atiewi, S. & Razaque, A. Deep recurrent neural network for IoT intrusion detection system. Simul. Model. Pract. Theory 101, 102031 (2020).
- 30. Bajpai, S., Sharma, K. & Chaurasia, B. K. Intrusion detection framework in IoT networks. SN Comput. Sci. 4 (4), 350 (2023).
- 31. Ullah, S. et al. A new intrusion detection system for the internet of things via deep convolutional neural network and feature engineering. Sensors 22 (10), 3607 (2022).
- 32. Awajan, A. A novel deep learning-based intrusion detection system for IoT networks. Computers 12 (2), 34 (2023).
- 33. Alghamdi, R. & Bellaiche, M. An ensemble deep learning based IDS for IoT using lambda architecture. Cybersecurity 6 (1), 5 (2023).
- 34. Kasongo, S. M. & Sun, Y. Performance analysis of intrusion detection systems using a feature selection method on the UNSW-NB15 dataset. J. Big Data 7, 1–20 (2020).
- 35. Kunang, Y. N., Nurmaini, S., Stiawan, D. & Suprapto, B. Y. Attack classification of an intrusion detection system using deep learning and hyperparameter optimization. J. Inf. Secur. Appl. 58, 102804 (2021).
- 36. Zhao, R. et al. A novel traffic classifier with attention mechanism for industrial internet of things. IEEE Trans. Ind. Inf. 10.1109/TII.2023.3241689 (2023).
- 37. Khan, I. A. et al. Fed-Inforce-Fusion: a federated reinforcement-based fusion model for security and privacy protection of IoMT networks against cyber-attacks. Inform. Fusion 101, 102002. 10.1016/j.inffus.2023.102002 (2024).
- 38. Nandanwar, H. & Katarya, R. TL-BILSTM IoT: transfer learning model for prediction of intrusion detection system in IoT environment. Int. J. Inf. Secur. 23, 1251–1277. 10.1007/s10207-023-00787-8 (2024).
- 39. Ishtiaq, W. et al. CST-AFNet: a dual attention-based deep learning framework for intrusion detection in IoT networks. Array 10, 100501. 10.1016/j.array.2025.100501 (2025).