
Table 5.

Recent DL-based DDoS attack detection studies with their preprocessing strategies, hyperparameter values, experimental setups, and performance metrics. Columns are separated by "|"; "—" marks a detail the study does not report

Taxonomy | References | Preprocessing strategies | Hyperparameter values | Experimental setups | Performance metrics
Supervised instance learning | Hasan et al. (2018) | — | The architecture consists of two convolutional layers followed by a max-pooling layer, a ReLU function, a fully connected layer (250 neurons), a ReLU function, a dropout layer, and an FC layer (four neurons). The stride size is 1 × 1 for the CLs and 1 × 2 for the PLs. Back-propagation and the SoftMax loss function are used | — | Accuracy = 99%, Sensitivity = 99%, Specificity = 99%, Precision = 99%, F1-score = 99%, False positive rate = 1%, False negative rate = 1%
Amma and Subramanian (2019) | Min–max normalization (a preprocessing sketch follows the table) | The pre-training module has two stages of training, each comprising a CL and a PL; the filter size is 3 and the max-pooling size is 2. Training module: the FCNN comprises an input layer, two hidden layers, and an output layer with 11, 9, 7, and 6 nodes, respectively, and the activation function is ReLU | — | Accuracy: Normal = 99.3%, Back = 97.8%, Neptune = 99.1%, Smurf = 99.2%, Teardrop = 83.3%, Others = 87.1%. Precision: Normal = 99.6%, Back = 95.9%, Neptune = 97.9%, Smurf = 91.1%, Teardrop = 19.6%, Others = 97.8%. Recall: Normal = 99.3%, Back = 97.8%, Neptune = 99.1%, Smurf = 92.2%, Teardrop = 83.3%, Others = 87.1%. F1-score: Normal = 99.4%, Back = 96.8%, Neptune = 98.5%, Smurf = 95.0%, Teardrop = 31.7%, Others = 92.1%. False alarm: Normal = 0.7, Back = 2.2, Neptune = 0.9, Smurf = 0.8, Teardrop = 16.7, Others = 12.9. AUC: Normal = 0.993, Back = 0.978, Neptune = 0.991, Smurf = 0.992, Teardrop = 0.833, Others = 0.871
Chen et al. (2019) | — | An incremental training method is used to train the MC-CNN | — | Accuracy: KDDCUP99 (2-class) = 99.18%, KDDCUP99 (5-class) = 98.54%, CICIDS2017 = 98.87%
Shaaban et al. (2019) | Features are converted into matrix form, i.e., 8 and 41 features into 3 × 3 and 6 × 7 matrices using padding | The CNN model contains three stages: the first comprises the input layer and two CLs whose output is fed to a PL; the second has two CLs and a PL; the third consists of an FC network and the output layer. ReLU is used in all layers except the output layer, which uses softmax | Keras and TensorFlow libraries | Dataset1: Accuracy = 0.9933, Loss = 0.0067. Dataset2 (NSL-KDD): Accuracy = 0.9924, Loss = 0.0076
Sabeel et al. (2019) | Z-score normalization | The input layer of the DNN/LSTM model has size 25, followed by a dense/recurrent layer of 60 neurons with a dropout rate of 0.2, an FC dense layer of 60 neurons, another dropout of 0.2, and a further dense layer of 60 neurons; all layers use the ReLU activation function. A dense FC output layer uses the sigmoid activation function. The learning rate is 0.0001 | GPU: NVIDIA Quadro K2200; CPU: Intel Xeon E5-2630 v3 @ 2.40 GHz; 240 GB SSD; 64-bit Windows 10 Pro 1809. Software: Python 3.7.3, TensorFlow 1.13.0, Keras 1.1.0, and NVIDIA CUDA Toolkit 10.0.130 with cuDNN 7.6.0 | Accuracy = 98.72%, TPR = 0.998, Precision = 0.949, F1-score = 0.974, AUC = 0.987
Virupakshar et al. (2020) | — | — | Controller: 1 GB RAM, 50 GB storage, 2-core processor. Neutron: 1 GB RAM, 20 GB storage, 2-core processor. Computer-1: 1 GB RAM, 20 GB storage, 2-core processor. Computer-2: 1 GB RAM, 20 GB storage, 2-core processor | KDDCUP99: Recall = 0.99, F1-score = 0.98, Support = 2190. LAN dataset: Recall = 0.91, F1-score = 0.91, Support = 2140. Cloud dataset: Recall = 0.91, F1-score = 0.91, Support = 2138, Precision = 96%
Haider et al. (2020) | Z-score normalization | The ensemble CNN model (M1 and M2) contains three 2-D CLs (with 128, 64, and 32 filters, respectively), two max PLs, one flatten layer, and two dense FC layers, with ReLU as the activation function in the hidden layers and sigmoid at the output layer | System manufacturer: Lenovo; Processor: Intel Core i7-6700 CPU @ 3.4 GHz; Memory: 8 GB; OS: Microsoft Windows 10. Software: Keras library with TensorFlow | Accuracy = 99.45%, Precision = 99.57%, Recall = 99.64%, F1-score = 99.61%, Testing time = 0.061 min, Training time = 39.52 min, CPU usage = 6.025%
Wang and Liu (2020) | Each byte of a packet is converted into a pixel and the packet is assembled into an image | The model includes two CLs, two PLs, and two FC layers. For the information-entropy stage, the threshold is set to 100 packets/s | Mininet emulator, POX controller, and a PC with an Intel Core i5-7300HQ CPU, 8 GB RAM, and Ubuntu 5.4.0-6. The experimental topology comprises six switches, one server, and a controller. Software: hping3, TensorFlow framework | Accuracy = 98.98%, Precision = 98.99%, Recall = 98.96%, F1-score = 98.97%, Training time = 72.81 s, AUC = 0.949
Kim et al. (2020) | One-hot encoding; 117 features are converted into images of 13 × 9 pixels and 78 features into 13 × 6 | The model comprises 1, 2, or 3 CLs, with the number of kernels set to 32, 64, and 128, respectively; the kernel size is set to 2 × 2, 3 × 3, and 4 × 4, with a stride of 1 | Python with TensorFlow | Accuracy: KDDCUP99 = 99%, CSE-CIC-IDS2018 = 91.5%
Doriguzzi-Corin et al. (2020) | Min–max normalization; traffic flows are converted into array-like data structures and split into sub-flows based on time windows | n = 100, t = 100, k = 64, h = 3, m = 98, where n is the maximum number of packets, t the time-window length in seconds, k the number of filters of the single CL, h × f the filter size (h is the filter length and f the number of features), and m the pool size. The batch size is s = 2048 with the Adam optimizer and a learning rate of 0.01; the output layer uses the sigmoid activation function | A server-class computer with two 16-core Intel Xeon Silver 4110 @ 2.1 GHz CPUs and 64 GB of RAM. Software: Python 3.6 with the Keras API 2.2.4 on top of TensorFlow 1.13.1 | ISCX2012: Accuracy = 0.9888, FPR = 0.0179, Precision = 0.9827, Recall = 0.9952, F1-score = 0.9889. CICIDS2017: Accuracy = 0.9967, FPR = 0.0059, Precision = 0.9939, Recall = 0.9994, F1-score = 0.9966. CSECIC2018: Accuracy = 0.9987, FPR = 0.0016, Precision = 0.9984, Recall = 0.9989, F1-score = 0.9987. UNB201X: Accuracy = 0.9946, FPR = 0.0087, Precision = 0.9914, Recall = 0.9979, F1-score = 0.9946
Asad et al. (2020) | Min–max scaling and a cost-sensitive learning technique | The model has an input layer (66 neurons), an output layer (5 neurons), and seven hidden layers (with 128, 256, 128, 64, 32, 16, and 8 neurons, respectively). Batch normalization with a batch size of 1024 and ReLU with a dropout rate of 0.2 are used in each hidden layer; the learning rate is 0.001 and the number of epochs is 300 | CPU platform: 2.5 GHz Intel Xeon E5 v2, 4 cores; GPU: NVIDIA Tesla K80 with 24 GB of GDDR5 memory; RAM: 26 GB | Accuracy = 98%, F1-score = 0.99, AUC = 1
Muraleedharan and Janet (2020) | — | The input and output layers have 80 and 5 neurons, respectively. The output layer uses softmax and the four hidden layers use ReLU; the Adam optimization algorithm and the categorical cross-entropy loss function are used | Keras API and scikit-learn | Accuracy = 99.61%. Precision: Benign = 0.99, Slowloris = 1.00, Slowhttptest = 0.99, Hulk = 1.00, GoldenEye = 1.00. Recall: Benign = 1.00, Slowloris = 0.99, Slowhttptest = 0.98, Hulk = 1.00, GoldenEye = 1.00. F1-score: Benign = 1.00, Slowloris = 0.99, Slowhttptest = 0.99, Hulk = 1.00, GoldenEye = 1.00
Sbai and El Boukhari (2020) | — | — | — | Precision = 0.99, Recall = 1, F1-score = 0.99, Accuracy = 0.99997
de Assis et al. (2020) | The qualitative dimensions are converted to quantitative ones using Shannon entropy | The CNN model is a stack of two Conv1D layers (with 16 and 8 filters) and MaxPooling1D layers (pool size 2), followed by a Flatten layer, a Dropout layer (rate 0.5), and an FC layer (10 neurons); the output is a single neuron with a sigmoid activation function. The model is trained for 1000 epochs (a minimal sketch of this architecture follows the table) | A computer with Windows 10 64-bit, an Intel Core i7 @ 2.8 GHz, and 8 GB of RAM. Software: Python and Keras | Simulated SDN data: Accuracy, Precision, Recall, and F-measure = 99.9% on average. CICDDoS2019: Accuracy = 95.4%, Precision = 93.3%, Recall = 92.4%, F-measure = 92.8%
Hussain et al. (2020) | Min–max normalization; the sample matrices are converted to images using the OpenCV library, and the 60 × 60 × 3 images are resized to 224 × 224 × 3 | The ResNet18 model consists of 10 CLs and 8 PLs. The output sizes for binary and multiclass classification are 1 and 12, respectively. A learning rate of 0.0001, a momentum of 0.9, 10 epochs for binary and 50 epochs for multiclass classification, and the SGD optimizer are used | — | Multiclass: Precision = 87%, Recall = 86%, Accuracy = 87.06%, F1-measure = 86%. Binary: Accuracy = 99.99%
Amaizu et al. (2021) | Min–max scaling function | Number of hidden layers = 7, activation function = ReLU, dropout layers = 2, learning rate = 0.001, loss function = SCC, epochs = 50 | One server, one firewall, two switches, and four PCs. Software: Keras Sequential and Functional APIs, Keras Tuner library | Recall = 99.30%, Precision = 99.52%, F1-score = 99.99%, Accuracy = 99.66%
Cil et al. (2021) | Min–max normalization | The DNN model comprises three hidden layers of 50 neurons each with the sigmoid activation function; the output layer has two neurons with the softmax activation function | A computer with Windows 10, an Intel Core i7-7700 CPU @ 4.2 GHz, 32 GB RAM, 2 × 512 GB SSDs, and an NVIDIA GTX 1080 Ti graphics coprocessor. Software: Python 3.7 and deep learning libraries | Dataset1: Accuracy = 0.9997, Precision = 0.9999, Recall = 0.9998, F1-score = 0.9998. Dataset2: Accuracy = 0.9457, Precision = 0.8049, Recall = 0.9515, F1-score = 0.8721
Supervised sequence learning | Li et al. (2018) | BOW; the 2-D feature matrix is transformed into a 3-D matrix | A DL model comprising an input layer, a forward recursive layer, a reverse recursive layer, an FC hidden layer, and an output layer | Two NVIDIA K80 GPUs and 128 GB memory. Software: Ubuntu 14.04, Keras, and Spirent contracting tools | Accuracy = 98%
Priyadarshini and Barik (2019) | One-hot encoding | The LSTM uses two hidden layers with 128 neurons and the sigmoid function; the output layer uses a tanh function. Loss function = binary cross-entropy, Adam optimizer, dropout rate = 0.2, mini-batch size = 512 | PHP and MySQL as prerequisites, CentOS 7, MariaDB, Apache server, Linux and Windows OS, hping3, Mininet emulator, Floodlight controller, and Python with the Keras and TensorFlow libraries | Accuracy = 98.88%
Liang and Znati (2019) | For each network flow F, a subsequence SF of n packets is inspected; a flow with too few packets is padded with fake packets | The model has two LSTM layers, a dropout layer, and an FC layer; a sequence of 10 packets is taken from each flow (a minimal sketch of this sequence model follows the table) | — | CICIDS2017 (Wednesday): Precision = 0.9995, Recall = 0.9997, F1-score = 0.9991. CICIDS2017 (Friday): Precision = 0.9998, Recall = 1, F1-score = 0.9999
Shurman et al. (2020) | One-hot encoding and RF for feature selection | The model comprises three LSTM layers with 128 neurons and the sigmoid function, three dropout layers, and a dense layer with the tanh function. The categorical cross-entropy loss function and the RMSprop optimizer are used | — | Accuracy = 99.19%
Assis et al. (2021) | An MD5 hashing process converts qualitative dimensions into quantitative values | A GRU layer (C = 32), followed by a dropout layer with a rate of 0.5 and an FC layer with ten neurons; the output layer has one neuron with the sigmoid function | A computer with Windows 10 64-bit, an Intel Core i7 @ 2.8 GHz, and 8 GB of RAM. Software: Python, Keras, and Sklearn | CICDDoS2019: average Accuracy, Precision, Recall, and F-measure = 99.94%; legitimate-flow classification rate = 99.6%. CICIDS2018: Accuracy = 97.1%, Precision = 99.4%, Recall = 94.7%, F-measure = 97%; legitimate-flow classification rate = 99.7%. Mitigation evaluation outcomes: (i) the absolute number of normal flows dropped, (ii) the absolute number of attack flows not dropped. CICDDoS2019: (i) 188, (ii) 48. CICIDS2018: (i) 2660, (ii) 48,636
Semi-supervised instance learning | Catak and Mustacoglu (2019) | Normalization | The AE model consists of an input layer, three hidden layers, and an output layer with 28, 19, 9, 19, and 28 units, respectively, and sigmoid as the activation function. The DNN comprises an input layer and five hidden layers with 28, 500, 800, 1000, 800, and 500 units, respectively. Mini-batch SGD optimization and the binary cross-entropy loss function are used | A CPU and an NVIDIA Quadro 1000M GPU. Software: 64-bit Python 3.5 with the Keras, TensorFlow, and scikit-learn libraries on 64-bit Windows 7 | UNSW-NB15: F1-score = 0.8985, Accuracy = 0.9744, Precision = 0.8924, Recall = 0.9053. KDDCUP99: overall Accuracy and Precision = 99%
Ali and Li (2019) | Non-numeric features are discretized | Nine MSDAs are used in the experiments, with the number of layers L = [1, 3, 5, 7, 9, 11] | A computer with 32.5 GB memory and NVIDIA Tesla V100 GPUs. Software: MATLAB | Average accuracy on datasets D1–D16 = 93%; on D2 = 97%
Yang et al. (2020) | The flow is divided into sub-flows according to a threshold value of 10 ms | The AE model has one input layer, three hidden layers, and one output layer, with 27-24-16-24-27 neurons, respectively. The leaky ReLU activation function, the Adam optimizer, and MSE loss are used, and the batch size is set to 32 (a minimal sketch of this autoencoder follows the table) | — | Exp1: SYNT: DR (detection rate) = 98.32%, FPR = 0.38%; UNB2017: DR = 94.10%, FPR = 1.88%. Exp2: SYNT: DR = 100%, FPR = 100%; UNB2017: DR = 94.14%, FPR = 1.91%. Exp3: Testset1: DR = 100%, FPR = 0.49%; Testset2: DR = 99.99%, FPR = 0.49%
Kasim (2020) | Label encoding and min–max normalization | AE parameters: 82 input neurons, 82 output neurons, 25 hidden neurons, learning rate 0.3, momentum 0.2. SVM parameters: 25 input and 2 output nodes, learning rate 0.01, 1000 iterations | A computer with an Intel Core i7-2760QM CPU @ 2.40 GHz and 8 GB of RAM. Software: REST API and Python with the Keras, Scapy, TensorFlow, and scikit-learn libraries | 1. Training time = 2.03 s, testing time = 21 ms, accuracy on CICIDS2017 = 99.90%. 2. Crafted DDoS attacks: Accuracy = 99.1%, AUC = 0.9988. 3. NSL-KDD test: Accuracy = 96.36%
Bhardwaj et al. (2020) | One-hot encoding and min–max normalization | AE model: two encoding layers with 70 and 50 neurons, a coding layer with 25 neurons, and two decoding layers, all with ReLU activation; the output layer uses sigmoid, and the optimizer is Adadelta. DNN: two hidden layers with 20 and 12 neurons; the optimizer is Adabound | PC with Windows 10 64-bit, 16 GB RAM, an Intel Core i7 CPU, and VMware Workstation | NSL-KDD: Accuracy = 98.43%, Precision = 99.22%, Recall = 97.12%, F1-score = 98.57%. CICIDS2017: Accuracy = 98.92%, Precision = 97.45%, Recall = 98.97%, F1-score = 98.35%
Premkumar and Sundararajan (2020) | — | — | Number of nodes: 200; simulation time: 500 s; number of attacking nodes = 5–20% of the normal nodes; a Constant Bit Rate (CBR) application is used | With attackers between 5 and 15%, the detection ratio is 86–99% and the false alarm rate is 15%
Hybrid learning | Roopak et al. (2019) | — | The CNN + LSTM model has a 1-D CNN layer with the ReLU function, followed by an LSTM layer trained with the Adam optimizer, a dropout layer with a rate of 0.5, an FC layer, and a dense layer with a sigmoid function | A PC with a 64-bit Intel Core i7 CPU and 16 GB RAM on Windows 7. Software: Keras on TensorFlow for DL and MATLAB 2017a for the ML algorithms | Accuracy = 97.16%, Recall = 99.1%, Precision = 97.41%
Li and Lu (2019) | BOW and feature hashing; the 2-D matrix is converted into a 3-D matrix | The LSTM module consists of two hidden layers, an FC layer of 256 neurons with the ReLU activation function, and an FC layer of 1 neuron with the sigmoid activation function | NVIDIA GTX 1050 GPU | Accuracy = 98.15%, Precision = 98.42%, Recall = 97.6%, TNR = 98.4%, FPR = 1.6%, F1-score = 98.05%
Roopak et al. (2020) | Min–max normalization | The 1-D CNN is followed by max-pooling, LSTM, and dropout layers with the ReLU activation function; the output layer classifies using a sigmoid function with binary cross-entropy. The dropout rate is 0.2, the learning rate 0.001, the batch size 256, and the number of epochs 100 (a minimal sketch of this hybrid model follows the table) | NVIDIA Tesla V100 GPUs with 16 GB VRAM and 256 GB RAM on 10 HPC nodes. Software: Keras on TensorFlow | Precision = 99.26%, Recall = 99.35%, Accuracy = 99.03%, F-measure = 99.36%, Training time = 15,313.10 s
Elsayed et al. (2020) | Min–max normalization | The RNN-AE consists of four RNN hidden layers; the encoder has 64, 32, 16, and 8 channels and the decoder the reverse. The last layer has two channels with the softmax function. The other parameters are the categorical cross-entropy loss function with the Adam optimizer, ReLU in all layers, 50 epochs, a batch size of 32, and a learning rate of 0.0001 | — | Precision: Attack = 0.99, Benign = 1.00. Recall: Attack = 0.99, Benign = 0.99. F1-score: Attack = 0.99, Benign = 0.99. Accuracy = 99%. AUC = 98.8%
Nugraha and Murthy (2020) | Min–max scaler | Three layers sit between the CNN and LSTM layers: dropout, max-pool, and flatten. The LSTM layer is followed by an FC dense layer with the ReLU function, a dropout layer, and a final dense layer with the sigmoid function. Learning rate = 0.0005, dropout rate = 0.3, 64 CNN filters with a kernel size of 5, and 50 epochs | Python | Accuracy = 99.998%, Precision = 99.989%, Specificity = 99.997%, Recall = 100%, F1-score = 99.994%
Transfer learning | He et al. (2020) | — | 8LANN consists of eight FC layers; every layer except the eighth is followed by batch normalization and the ReLU function. A batch size of 500, the cross-entropy loss function, the SGD optimizer, and a learning rate of 0.001 are used (a generic fine-tuning sketch follows the table) | Ubuntu 16.04 64-bit OS with 64 GB of memory; the GPU accelerator is an NVIDIA RTX 2080 Ti. Software: PyTorch | Detection performance = 87.8%, Transferability value = 19.65
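
Several of the rows above (Amma and Subramanian 2019; Doriguzzi-Corin et al. 2020; Kasim 2020; Cil et al. 2021, among others) list min–max normalization as the preprocessing step. A minimal sketch of that step with scikit-learn follows; the arrays are placeholders, and fitting the scaler on the training split only (to avoid leakage) is our assumption rather than a detail reported in the table.

```python
# Minimal min-max normalization sketch (scikit-learn). The arrays below are
# placeholders; real inputs would be the flow-feature matrices of a dataset
# such as CICIDS2017.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.random.rand(1000, 66)  # placeholder training features
X_test = np.random.rand(200, 66)    # placeholder test features

scaler = MinMaxScaler()             # scales every feature to [0, 1]
X_train_scaled = scaler.fit_transform(X_train)  # learn min/max on train only
X_test_scaled = scaler.transform(X_test)        # reuse the training min/max
```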
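
The de Assis et al. (2020) row describes its CNN precisely enough to sketch in Keras: two Conv1D/MaxPooling1D stages with 16 and 8 filters and pool size 2, a Flatten layer, Dropout at 0.5, a 10-neuron FC layer, and a single sigmoid output neuron. The input shape, kernel size, optimizer, and loss below are assumptions, so treat this as a sketch rather than the authors' implementation.

```python
# Hedged Keras sketch of the CNN in the de Assis et al. (2020) row.
# Assumed (not in the table): input shape (20 features), kernel size 3,
# Adam optimizer, binary cross-entropy loss.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(20, 1)),              # assumed: 20 features, 1 channel
    layers.Conv1D(16, 3, activation="relu"),  # first convolutional stage
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(8, 3, activation="relu"),   # second convolutional stage
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(10, activation="relu"),      # FC layer with 10 neurons
    layers.Dense(1, activation="sigmoid"),    # attack vs. benign
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, epochs=1000) would mirror the 1000 epochs above.
```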
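
For the supervised sequence learners, the Liang and Znati (2019) row (two LSTM layers, a dropout layer, and an FC layer over a 10-packet subsequence per flow) translates into the following hedged Keras sketch; the LSTM unit counts, the per-packet feature count, and the training configuration are assumptions.

```python
# Hedged Keras sketch of a per-flow sequence model in the style of the
# Liang and Znati (2019) row. Assumed: 64 LSTM units and 8 per-packet features.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(10, 8)),             # 10 packets per flow, 8 assumed features each
    layers.LSTM(64, return_sequences=True),  # first LSTM layer emits the full sequence
    layers.LSTM(64),                         # second LSTM layer keeps the final state
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),   # DDoS vs. benign flow
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```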
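
The Yang et al. (2020) row fixes the autoencoder shape (27-24-16-24-27 neurons, leaky ReLU, Adam, MSE, batch size 32). The sketch below adds the usual semi-supervised usage pattern, which the table implies but does not spell out: train on benign sub-flows only and flag samples whose reconstruction error exceeds a threshold. The placeholder data and the 99th-percentile threshold rule are assumptions.

```python
# Hedged Keras sketch of the autoencoder in the Yang et al. (2020) row:
# 27-24-16-24-27 neurons, leaky ReLU, Adam optimizer, MSE loss, batch size 32.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

ae = models.Sequential([
    layers.Input(shape=(27,)),
    layers.Dense(24), layers.LeakyReLU(),
    layers.Dense(16), layers.LeakyReLU(),    # bottleneck
    layers.Dense(24), layers.LeakyReLU(),
    layers.Dense(27),                        # reconstruct all 27 features
])
ae.compile(optimizer="adam", loss="mse")

X_benign = np.random.rand(5000, 27)          # placeholder benign sub-flows
ae.fit(X_benign, X_benign, batch_size=32, epochs=10, verbose=0)

# Score new traffic: a large reconstruction error suggests an anomalous flow.
train_err = np.mean((X_benign - ae.predict(X_benign, verbose=0)) ** 2, axis=1)
threshold = np.percentile(train_err, 99)     # assumed thresholding rule

X_new = np.random.rand(100, 27)              # placeholder traffic to score
new_err = np.mean((X_new - ae.predict(X_new, verbose=0)) ** 2, axis=1)
is_attack = new_err > threshold
```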
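
The hybrid CNN + LSTM detectors (Roopak et al. 2019, 2020; Nugraha and Murthy 2020) share one pipeline: a 1-D convolution extracts local patterns, pooling compresses them, and an LSTM models the resulting sequence before a sigmoid output. A minimal Keras sketch following the Roopak et al. (2020) hyperparameters (dropout 0.2, learning rate 0.001, batch size 256) is given below; the filter count, kernel size, LSTM units, and input shape are assumptions.

```python
# Hedged Keras sketch of the hybrid pipeline in the Roopak et al. (2020) row:
# Conv1D -> MaxPooling1D -> LSTM -> Dropout -> sigmoid output with binary
# cross-entropy, dropout rate 0.2 and learning rate 0.001 as in the table.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(80, 1)),              # assumed: 80 flow features as a sequence
    layers.Conv1D(64, 3, activation="relu"),  # local pattern extraction
    layers.MaxPooling1D(pool_size=2),         # compress before the recurrent layer
    layers.LSTM(50),                          # sequence modelling (units assumed)
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),    # binary DDoS decision
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, batch_size=256, epochs=100) matches the row.
```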
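
Finally, the He et al. (2020) row reports transfer learning with the FC + batch-norm + ReLU 8LANN network in PyTorch. The sketch below shows the generic freeze-and-fine-tune pattern, rendered in Keras for consistency with the other examples; it is not the authors' 8LANN code, and all layer sizes and the random source/target data are illustrative.

```python
# Generic freeze-and-fine-tune transfer-learning sketch (Keras). Only the
# FC + BatchNorm + ReLU pattern, the batch size of 500, the SGD optimizer,
# and the 0.001 learning rate echo the row above; everything else is assumed.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_fc_net(n_features):
    # Small FC stack loosely echoing the FC + BatchNorm + ReLU pattern.
    return models.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(64), layers.BatchNormalization(), layers.ReLU(),
        layers.Dense(32), layers.BatchNormalization(), layers.ReLU(),
        layers.Dense(1, activation="sigmoid"),
    ])

model = build_fc_net(40)
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])

# Pretrain on the source traffic domain (placeholder data).
X_src = np.random.rand(2000, 40)
y_src = np.random.randint(0, 2, 2000)
model.fit(X_src, y_src, batch_size=500, epochs=3, verbose=0)

# Freeze the transferred layers, then fine-tune the head on the target domain.
for layer in model.layers[:-1]:
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])
X_tgt = np.random.rand(200, 40)
y_tgt = np.random.randint(0, 2, 200)
model.fit(X_tgt, y_tgt, batch_size=50, epochs=3, verbose=0)
```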