. 2023 Feb 17;84:104718. doi: 10.1016/j.bspc.2023.104718

An augmented Snake Optimizer for diseases and COVID-19 diagnosis

Ruba Abu Khurma a, Dheeb Albashish b, Malik Braik b, Abdullah Alzaqebah c, Ashwaq Qasem d, Omar Adwan a,e
PMCID: PMC9935299  PMID: 36811003

Abstract

Feature Selection (FS) techniques extract the most recognizable features for improving the performance of classification methods for medical applications. In this paper, two intelligent wrapper FS approaches based on a new metaheuristic algorithm named the Snake Optimizer (SO) are introduced. The binary SO, called BSO, is built based on an S-shape transform function to handle the binary discrete values in the FS domain. To improve the exploration of the search space by BSO, three evolutionary crossover operators (i.e., one-point crossover, two-point crossover, and uniform crossover) are incorporated and controlled by a switch probability. The two newly developed FS algorithms, BSO and BSO-CV, are implemented and assessed on a real-world COVID-19 dataset and 23 disease benchmark datasets. According to the experimental results, the improved BSO-CV significantly outperformed the standard BSO in terms of accuracy and running time in 17 datasets. Furthermore, it shrinks the COVID-19 dataset’s dimension by 89% as opposed to the BSO’s 79%. Moreover, the adopted operator on BSO-CV improved the balance between exploitation and exploration capabilities in the standard BSO, particularly in searching and converging toward optimal solutions. The BSO-CV was compared against the most recent wrapper-based FS methods; namely, the hyperlearning binary dragonfly algorithm (HLBDA), the binary moth flame optimization with Lévy flight (LBMFO-V3), the coronavirus herd immunity optimizer with greedy crossover operator (CHIO-GC), as well as four filter methods with an accuracy of more than 90% in most benchmark datasets. These optimistic results reveal the great potential of BSO-CV in reliably searching the feature space.

Keywords: Snake Optimizer, Feature selection, COVID-19, Transfer function, Greedy crossover

1. Introduction

The volume of medical data expands steadily to keep up with the rapid changes in medical equipment. Nowadays, machine learning and data science techniques play a vital role in medical diagnosis, particularly in discriminating between various forms of cancer. This diagnostic task is considered a classification task in machine learning, aiming to classify the input medical data into several discrete cases (e.g., benign and malignant). Many of the features collected in the medical domain are redundant, noisy, or irrelevant to classification tasks. Using irrelevant, noisy, and redundant features degrades the performance of classification models in medical diagnosis; as a result, the final decision in this domain becomes shaky and untrustworthy [1]. Therefore, it is necessary to pick only the proper features on which to train the learning model. This boosts the effectiveness of the classifier’s output while reducing the learning model’s training time, particularly when dealing with large datasets. In machine learning, FS methods are considered essential preprocessing algorithms for optimizing the efficiency of classification methods by identifying a meaningful pattern to support the classifiers’ final judgment.

The FS task in the medical domain essentially involves devising a procedure to obtain a suitable subset of features from the original large dataset (i.e., all features). This subset includes features crucial to the current problem while excluding unnecessary or redundant ones. If all features are utilized for the classification of medical tasks, the learning model becomes prone to overfitting due to the curse of dimensionality, and ultimate performance suffers in terms of accuracy or runtime [2]. Therefore, the primary purpose of FS algorithms rests on two objectives: generating a smaller version of the original dataset by selecting the most relevant features and excluding irrelevant and redundant ones, while improving classification performance [3], [4]. Implementing the FS process has a significant effect in avoiding the curse of dimensionality, which makes the learning methods less likely to overfit [5].

Typically, the feature selection process is divided into five stages [5]: initialization, subset discovery/search, subset evaluation, stopping criterion, and final subset validation. The subsets of features are generated in subset discovery, where each subset is chosen from the whole set of features (all dataset features). The search approach, in particular, explores the search space to identify the optimal feature subset. In reality, each feature in the new subset is checked for eligibility using a forward or backward elimination procedure. The quality of the selected features is assessed using a subset evaluation function. For this assignment, the majority of the FS approaches employ a predictive model with a suitable fitness function (i.e., accuracy). The halting condition is utilized to prevent the FS techniques from becoming trapped in an indefinite loop. Most stopping conditions include the maximum number of iterations as a predefined parameter [6].

FS algorithms can be divided into four classes based on the evaluation method used: filter, embedded, wrapper, and hybrid-based methods [6]. Filter-based FS approaches leverage statistical assessment metrics to rank features. Each feature is granted a score based on the designated metric (for example, information gain (IG) or F-score), and the features are then ranked by their scores (ascending or descending). The high-score features are considered the most effective in the current domain. Generally, filter-based methods have no real interaction with the classifier (predictive) model; as a result, they are faster than wrapper and embedded methods. Many filters have been adopted in the literature, such as ReliefF [7], mutual information, absolute cosine (AC), and mRMR [8]. The second type is embedded methods, where the FS process is integrated into the classifier learning to become a single process, such as SVM-recursive feature elimination (SVM-RFE) [9]. A hybridization of embedded feature selection with a filter method can be found in our latest study, where we combined the AC with SVM-RFE to handle redundancy in SVM-RFE, denoted SVM(AC) [1].
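To make the filter idea concrete, the score-and-rank pass can be sketched with information gain (a generic Python illustration, not code from the paper; the helper names are ours):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(y) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG of a discrete feature: H(y) minus the weighted entropy
    of the labels after splitting on the feature's values."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def rank_features(feature_columns, labels, k):
    """Rank features by IG (descending) and keep the indices of the top k."""
    scores = [(i, information_gain(col, labels))
              for i, col in enumerate(feature_columns)]
    scores.sort(key=lambda s: s[1], reverse=True)
    return [i for i, _ in scores[:k]]
```

A feature that perfectly separates the classes receives IG equal to H(y), while a constant feature receives 0, so the ranking keeps the former. No classifier is consulted, which is why filter methods are fast.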

In contrast to filter methods, wrapper methods make use of a predictive model (e.g., K-Nearest-Neighbor (KNN)) as part of the assessment phase to evaluate the fitness value of the acquired feature subset. The wrapper-based approach finds an appropriate subset (i.e., solution) for the current task. However, because the total number of potential solutions is 2^n, where n is the number of dimensions, it is difficult to find a near-optimal subset of features in terms of objective fitness due to the vast search space. This problem becomes even more complicated as n increases dramatically in many fields during the data collection phase, so the complexity of these problems grows. This indicates that standard brute-force techniques are unfeasible and that advanced search techniques should be utilized instead. Hence, one of the promising techniques for these problems is Meta-heuristic Algorithms (MAs).

MAs are intelligent algorithms that involve mathematical operations and make several attempts to identify an optimal solution from a set of random solutions with the assistance of the learning model for a particular task [10]. MAs compute either a single objective or multiple objectives to select the optimal solution. To be precise, MAs use the information obtained during the search to guide the optimization process. They usually merge numerous solutions to generate a highly proficient one (e.g., crossover in the Genetic Algorithm (GA)) and thereby avoid getting stuck in local minima. While MAs search for the optimal solution, they usually perform two stages of search: exploration and exploitation [11]. During the exploration stage, the investigation covers a variety of regions to identify more locations of high-quality solutions. In contrast, at the exploitation stage, available resources are focused on a specific search location. The main challenge for MAs is to strike a balance between exploration and exploitation [3].

Using MAs for FS problems is considered a multi-objective task, where the primary goal is to preserve a minimum number of selected features while improving classification performance. However, these two objectives are contradictory, and the optimal decision should be determined by making a trade-off between them. Recently, numerous MAs have been adopted for FS in medical classification and diagnosis applications. They are utilized in wrapper or hybrid wrapper-filter approaches. These include Moth Flame Optimization (MFO) [12], the Coronavirus Herd Immunity Optimizer (CHIO) [2], Particle Swarm Optimization (PSO) [11], [13], the Rat Swarm Optimizer (RSO) [14], and the Mine Blast Algorithm (MBA) [15]. Given the advantages of MAs in FS problems, we chose the Snake Optimizer (SO), one of the most recent MAs, for this work. The SO is a newly invented, continuous, nature-inspired method that mimics snakes’ mating and fighting behaviors. SO includes mating and fighting modes: the former occurs at cold temperatures, while in the latter the snakes fight until each male gets a female and each female gets the best male. If no food is encountered, the exploration stage starts to search for food; in contrast, if the snakes eat the food, this is a case of exploitation.

SO has several particular advantages over other MAs. First, it has a novel natural inspiration: this is the first time the mating behavior of snakes has been proposed for solving optimization problems. Second, experimental results and statistical comparisons prove the effectiveness and efficiency of SO on different landscapes concerning the exploration–exploitation balance and convergence speed [16]. Third, it has high stability and good convergence, and it is simple to implement and parameter-less [17].

However, many optimization tasks (e.g., feature selection) include discrete search space and decision variables. Besides, updating the population impacts the population’s diversity; as a result, the exploration stage needs to be improved to fully explore the search space [15], [18].

In this study, the authors strive to exploit the swarm-based SO algorithm to build a wrapper-based approach that addresses several medical classification problems. This depends on nominating the most valuable and informative medical features in a specific dataset, those required for generating the best medical classification model with higher performance, fewer features, and less running time. Increasing the effectiveness of medical models using AI tools can serve as a low-cost diagnostic aid with fewer side effects on patients. Therefore, SO was adopted to search the feature space for the best feature subset. Since SO was developed to deal with continuous optimization problems and had never before been applied to a discrete search space, in the first experiments the authors generated a new binary version of SO, called BSO, using a common and widely used S-shaped transfer function. The BSO was validated by examining its performance using several evaluation measures, such as accuracy, sensitivity, specificity, fitness value, number of selected features, running time, convergence curves, box plots, convergence speed, Holm’s test, and Friedman’s test. In the second experiment, new evolutionary greedy crossover operators (GC) (i.e., one-point, two-point, and uniform crossovers) are integrated with SO to enhance its explorative power in the feature space; these are controlled by a switch probability. The two newly developed FS algorithms, BSO and BSO-CV, were implemented and assessed on a real-world COVID-19 dataset and 23 disease benchmark datasets. According to the experimental results, the improved BSO-CV significantly outperformed the standard BSO in terms of accuracy and running time on 17 datasets. Furthermore, it shrinks the COVID-19 dataset’s dimension by 89%, as opposed to the BSO’s 79%. Moreover, the adopted operator in BSO-CV improved the balance between exploitation and exploration in the standard BSO, particularly in searching and converging toward optimal solutions. The BSO-CV was compared against the most recent wrapper-based FS methods, namely the hyperlearning binary dragonfly algorithm (HLBDA), the binary moth flame optimization with Lévy flight (LBMFO-V3), and the coronavirus herd immunity optimizer with greedy crossover operator (CHIO-GC), as well as four filter methods, achieving an accuracy of more than 90% on most benchmark datasets. These optimistic results reveal the great potential of BSO-CV in reliably searching the feature space.
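The three crossover operators and the switch probability can be sketched on binary feature vectors as follows (a hedged illustration: the uniform choice among operators and the no-crossover fallback are our assumptions, not the paper’s exact BSO-CV procedure):

```python
import random

def one_point(p1, p2, rng):
    """One-point crossover: cut both parents at one index and swap tails."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def two_point(p1, p2, rng):
    """Two-point crossover: take the middle segment from the second parent."""
    a, b = sorted(rng.sample(range(1, len(p1)), 2))
    return p1[:a] + p2[a:b] + p1[b:]

def uniform(p1, p2, rng):
    """Uniform crossover: each bit comes from either parent with equal chance."""
    return [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]

def crossover(p1, p2, switch_prob, rng):
    """Apply one of the three operators, gated by a switch probability."""
    if rng.random() >= switch_prob:
        return list(p1)  # no crossover on this step
    operator = rng.choice([one_point, two_point, uniform])
    return operator(p1, p2, rng)
```

Mixing segments of two good solutions this way injects diversity into the swarm, which is the explorative boost the GC operators are meant to provide.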

The manuscript is organized as follows: Section 2 reviews related works. The theoretical and mathematical background of SO is presented in Section 3. In Section 4, the details of the new BSO and BSO-CV methods for FS are outlined. Then, in Section 5, the obtained results and related comparisons are reported and discussed, followed by the computational statistical test analysis in Section 6. Finally, the conclusion and several recommendations for future research are given in Section 7.

2. Literature review

Medical applications are a critical research area for machine learning scientists. Recently, many studies that exploit artificial intelligence and data science techniques have assisted in developing medical models. This depends on medical images, patient medical files, and other features to predict disease occurrence at an early stage [5].

2.1. Evolutionary feature selection for disease diagnosis

This subsection sheds light on recent research in the field of medical applications that has developed evolutionary FS models to support physicians. In [19], the authors developed a new model for the early prediction of diabetes. The new model used Grey Wolf Optimization (GWO) and Adaptive Particle Swarm Optimization (APSO) to improve the Multilayer Perceptron (MLP). They were able to reduce the number of selected features and achieve high-performance results: GWO-MLP and APGWO-MLP obtained accuracies of 96% and 97%, respectively.

Mazaher et al. [20] developed a new computer-aided diagnosis (CAD) system to detect different types of cardiac arrhythmia disorders using the ElectroCardioGram (ECG) signal. After the preprocessing steps, different features of the ECG signals were segmented and analyzed. Several metaheuristic algorithms were used in combination with the selected features. The best results were obtained using a multi-objective optimization algorithm called the Non-dominated Sorting Genetic Algorithm (NSGA-II). The feed-forward neural network accuracy for heart disease classification was 98.75%. In [21], the authors proposed a new model based on the Marine Predator Algorithm (MPA) to extract the most significant feature subset and enhance the classification accuracy using the k-Nearest Neighbors (k-NN) classifier. The MPA-KNN was applied to 18 medical datasets and achieved the best results compared with other meta-heuristic algorithms.

The Moth Flame Optimization algorithm (MFO) [22] was one of the optimization algorithms used in developing FS approaches to handle medical diagnosis [12], [18], and [23]. In [12], Khurma et al. generated eight binary MFO versions using eight transfer functions. Then, they applied the Levy flight operator in combination with transfer functions to increase the diversity of the algorithm and support the exploration of the search space. The proposed approach achieved an accuracy of 83% over 23 datasets. In [18], an FS model based on the Moth Flame optimization algorithm (MFO) was proposed. The performance of MFO was improved by adopting an adaptive method to update the position of a solution. The proposed MFO was tested on sixteen medical datasets, and the results showed promising classification results. Another study [23] proposed the MFO using Levy flight and different selection mechanisms: random selection, tournament selection, and roulette wheel selection methods to decrease the bias of the MFO algorithm toward exploitation. The proposed methods were tested using 23 medical data sets. Their results showed an enhanced behavior of MFO in the exploration, convergence, and diversity of solutions.

Dhanusha et al. [24] proposed a new model for Alzheimer’s disease (AD) based on imaging data and the clinical profile. The memetic metaheuristic model was called the Chaotic Shuffled Frog Leaping Algorithm (CSFLA). It used chaotic mapping when a solution in the search space obtained the worst result. CSFLA [24] was a simple model with few parameters; it generated smaller subsets of features with less computation time and the best performance compared with other algorithms when paired with a deep neural network.

Jaddi et al. [25], employed the Cell Separation Algorithm (CSA) for cancer classification based on applying feature selection to microRNA data. The authors enhanced the movement of virtual cells in the CSA to achieve a balance between global and local search. The improved CSA (I-CSA) was tested using 22 classifiers on 25 test functions and four general biological classification problems, and an experiment for feature selection from microRNA data was performed. The accuracy of each cancer type was also compared with the accuracy of 77 classifiers reported in previous studies. The proposed approach obtained 100% accuracy in 25 out of 29 classes.

Abouelmagd et al. [26] applied the Coral Reefs Optimization (CRO) algorithm for FS of breast cancer. This was based on using five classifiers. The algorithm achieved an accuracy of 100% in four algorithms and 99.1% using one classifier. In the study performed by Alweshah [2], an FS approach was applied to determine the most informative subset of features for several medical problems. The Coronavirus Herd Immunity Optimizer (CHIO) was used with and without a Greedy Crossover (GC) operator to improve the exploration of the CHIO. The CHIO and CHIO-GC were applied to 24 medical datasets. The results show that CHIO-GC was better than CHIO in terms of accuracy, the number of selected features, F-measure, and convergence speed. The CHIO-GC obtained an accuracy of 79% on medical benchmark datasets and an accuracy of 93% on the COVID-19 dataset.

Kanya [27] developed a new CAD system that uses mammogram images for early detection of breast cancer. The authors applied feature extraction, feature selection, and other preprocessing steps. They proposed the Weighted Adaptive Binary Teaching Learning Based Optimization (WA-BTLBO) with an XGBoost classifier. The experiments showed high-accuracy results in classifying mammogram images as normal or abnormal.

2.2. Machine learning techniques for tackling COVID-19: Background

Dey [28] proposed a hybrid model applied in two stages. The first stage fine-tuned the parameters of Convolutional Neural Networks (CNNs) to extract features from the COVID-19 patient’s infected lungs. The second stage applied the Manta Ray Foraging-based Golden Ratio Optimizer (MRFGRO) to select the most informative feature subset. The proposed model achieved classification accuracies of 99.15%, 99.42%, and 95.57% on three COVID-19 datasets, respectively.

Aslan [29] presented a classification model that extracted features using a CNN and identified the hyperparameters of the algorithms via Bayesian Optimization. The main contribution of that study was using Artificial Neural Networks (ANNs) for lung image segmentation; it also classified the chest images computed from the COVID-19 Radiography Database. Using classifiers together with the best hyperparameters produced optimized results; the best achieved result was 96.29% using SVM. In [30], an evolutionary algorithm, a deep learning algorithm, and an advanced interpretation model were combined into one framework to help clinical decision-makers deal with different pandemic cases promptly. The feature selection stage was implemented using a genetic algorithm, and a deep artificial neural network achieved an AUC of 0.883.

Bandyopadhyay et al. [31] proposed a two-stage method that applies feature extraction and feature selection for detecting COVID-19 from CT scan images. For feature extraction, the CNN DenseNet architecture was used. Harris Hawks Optimization (HHO), Simulated Annealing (SA), and chaotic maps were combined to perform feature selection. The method was applied to the SARS-COV-2 CT-Scan dataset, and the achieved accuracy was 98.42%.

Deniz et al. [32] used the genetic algorithm and Extreme Learning Machines (MG-ELM), a multi-threaded genetic feature selection algorithm, to predict the risk level of COVID-19 patients. The authors studied the effects of the multi-threaded genetic algorithm implementation with statistical analysis. To verify the efficiency of MG-ELM, they compared their results with traditional and more recent techniques. The proposed algorithm outperformed other algorithms in terms of prediction accuracy.

Kurnaz et al. [33] applied an FS approach using the Crow Learning Algorithm and an ANN. The FS was used to select the relevant features for COVID-19 disease. The experiments were applied to a COVID-19 dataset from a Brazilian hospital. The experimental results showed an accuracy of 94.31%.

In the study conducted by Kukker et al. [34], reinforcement learning was applied to determine COVID-19 using chest X-ray images. The author used the JAYA-Optimization algorithm, Wavelet Transform, feature extraction, and Principal Component Analysis feature reduction technique on X-ray images. The obtained accuracy of the COVID-19 prediction using the proposed method was 87.75%.

Ragab et al. [35], used the ensemble method for the detection of COVID-19. In addition, Gaussian filtering was used to eliminate noise and enhance image quality. Furthermore, a Shark Optimization Algorithm (SOA) with Recurrent Neural Networks (RNN) was applied to extract features. An Improved Bat Algorithm with a Multiclass Support Vector Machine (IBA-MSVM) was used for CT scan classification. The results showed promising classification performance over other approaches.

In [36], the authors introduced a novel HyperLearning Binary Dragonfly Algorithm (HLBDA) to select the most promising features from the COVID-19 dataset. The results showed that the HLBDA achieved higher results than other related algorithms on the same dataset. In [37], the Ensemble Support vector machine with Ludo Game-based Swarm Algorithm (ESLGSA) was used for the COVID-19 prediction from the CT and X-ray images. The proposed approach reduced the physical labeling of the images. The accuracy results were 99.64% while the AUC was 0.9257.

According to a recent study [38], the authors utilized PSO with a convolutional neural network (PSTCNN) to discover COVID-19 using chest computed tomography (CT) medical images. In more detail, the authors used PSO to self-tune the CNN’s hyperparameters to improve diagnosis performance. The proposed PSTCNN achieved an accuracy of 93.99%±1.78% for binary classification. However, the PSTCNN is only utilized for tuning three hyperparameters (i.e., the coefficient that controls the decay rates of the past gradient, the square of the decay rates of the past gradient, and the learning rate).

The authors in [39] utilized deep learning with the self-adaptive Jaya algorithm (WE-SAJ) for COVID-19 CT image diagnosis. The proposed model first extracted wavelet entropy features from the CT images, then utilized the self-adaptive Jaya algorithm to train the model. Finally, they employed a 2-layer feedforward neural network (FNN) as the classifier. The proposed WE-SAJ model achieved more than 85% sensitivity. Although this model was well designed, it requires hyperparameter tuning to improve the obtained results and achieve fast convergence.

The bottom line is that many evolutionary algorithms have been applied in medical applications to diagnose several diseases. The findings showed that applying these intelligent algorithms with FS as a preprocessing stage can enhance the classification results. Concerning COVID-19, several machine-learning algorithms have been utilized to detect COVID-19. However, the critical aspects of medical diagnosis pushed researchers to propose new methodologies and new enhancement strategies to optimize the random search.

According to the No-Free-Lunch theorem, there is still room for proposing new algorithms to diagnose diseases. In this study, the recent SO algorithm is proposed for the first time for medical diagnosis. A binary version is produced to perform FS within a wrapper framework. Furthermore, crossover operators are proposed to enhance the search capability and generate more balance between the exploration and exploitation phases. The target is to enforce more diversity among solutions and help entrapped solutions escape from local minima.

3. Snake optimizer (SO)

The SO algorithm is inspired by the behavior of snakes in nature [16]. The following points show the main SO steps. First, SO initializes a set of random solutions in the search space using Eq. (1):

Snake_i = Snake_min + rand × (Snake_max − Snake_min) (1)

where Snake_i is the location in the search space of the ith solution in the swarm, rand is a random number in [0, 1], and Snake_max and Snake_min are the maximum and minimum values, respectively, for the studied problem.

The population is divided into two parts (50% male and 50% female) using Eqs. (2), (3):

Num_male ≈ Num/2 (2)
Num_female = Num − Num_male (3)

where Num is the size of the population (all snakes), Num_male is the number of male solutions, and Num_female is the number of female solutions.
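Eqs. (1)–(3) translate directly into code; the sketch below (variable names are ours) initializes the swarm and splits it into the two groups:

```python
import random

def init_population(num, dim, x_min, x_max, seed=0):
    """Eq. (1): Snake_i = Snake_min + rand * (Snake_max - Snake_min),
    then Eqs. (2)-(3): split the swarm into male and female halves."""
    rng = random.Random(seed)
    population = [[x_min + rng.random() * (x_max - x_min) for _ in range(dim)]
                  for _ in range(num)]
    num_male = num // 2              # Eq. (2)
    males = population[:num_male]
    females = population[num_male:]  # Eq. (3): Num - Num_male solutions
    return males, females
```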

Get the best solution from the male group (Snake_best_male) and the female group (Snake_best_female), and find the location of the food, L_food. Two other concepts are defined: the temperature (Temperature) and the quantity of food (Quantity), as in Eqs. (4), (5), respectively.

Temperature = exp(−Cur_iter / Tot_iter) (4)

where Cur_iter is the current iteration and Tot_iter is the total number of iterations.

Quantity = Const1 × exp((Cur_iter − Tot_iter) / Tot_iter) (5)

where Const1 is a constant equal to 0.5.
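The two schedules in Eqs. (4)–(5) can be coded directly (using the constants stated in the text):

```python
import math

def temperature(cur_iter, tot_iter):
    """Eq. (4): Temperature = exp(-Cur_iter / Tot_iter),
    decaying from 1 toward exp(-1) over the run."""
    return math.exp(-cur_iter / tot_iter)

def food_quantity(cur_iter, tot_iter, const1=0.5):
    """Eq. (5): Quantity = Const1 * exp((Cur_iter - Tot_iter) / Tot_iter),
    growing toward Const1 as the iterations proceed."""
    return const1 * math.exp((cur_iter - tot_iter) / tot_iter)
```

Early in the run, Quantity stays below the 0.25 threshold (0.5·e⁻¹ ≈ 0.18 at iteration 0), so the swarm explores; later it crosses the threshold and the temperature decides between moving to the food and fighting or mating.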

Exploring the search space (food is not found): this depends on a specified threshold value. If Quantity < 0.25, the solutions search globally by updating their locations with respect to a specified random location in the search space. This is modeled by Eqs. (6)–(9):

Snake_male_i(iter+1) = Snake_male_rand(iter) ± Const2 × AB_male × ((Snake_max − Snake_min) × rand + Snake_min) (6)

where Snake_male_i is the ith male solution, Snake_male_rand is the location of a random male solution, rand is a random number in [0, 1], and AB_male is the ability of the male solution to find the food, computed using Eq. (7):

AB_male = exp(−Fitness_male_rand / Fitness_male_i) (7)

where Fitness_male_rand is the fitness of Snake_male_rand, Fitness_male_i is the fitness of the ith solution in the male group, and Const2 is a constant equal to 0.05.

Snake_female_i(iter+1) = Snake_female_rand(iter) ± Const2 × AB_female × ((Snake_max − Snake_min) × rand + Snake_min) (8)

where Snake_female_i is the ith female solution, Snake_female_rand is the location of a random female solution, rand is a random number in [0, 1], and AB_female is the ability of the female solution to find the food, computed using Eq. (9):

AB_female = exp(−Fitness_female_rand / Fitness_female_i) (9)

where Fitness_female_rand is the fitness of Snake_female_rand and Fitness_female_i is the fitness of the ith solution in the female group.
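Since Eqs. (6)–(9) have the same shape for both groups, a single helper can sketch the exploration step (the random ± sign is our reading of the diversity operator described at the end of this section):

```python
import math
import random

CONST2 = 0.05  # Const2 from Eqs. (6) and (8)

def explore_step(x_rand, fit_rand, fit_i, x_min, x_max, rng):
    """Eqs. (6)-(9): build a new position from a random peer's location,
    scaled by the foraging ability AB = exp(-fit_rand / fit_i)."""
    ability = math.exp(-fit_rand / fit_i)   # Eq. (7) or (9)
    sign = 1 if rng.random() < 0.5 else -1  # the +/- diversity operator
    return [xr + sign * CONST2 * ability * ((x_max - x_min) * rng.random() + x_min)
            for xr in x_rand]
```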

Exploiting the search space (food is found): if the quantity of food is greater than a specified threshold (Quantity > 0.25), then the temperature is checked. If Temperature > 0.6 (hot), the solutions will move to the food only:

Snake_(i,j)(iter+1) = L_food ± Const3 × Temperature × rand × (L_food − Snake_(i,j)(iter)) (10)

where Snake_(i,j) is the location of a solution (male or female), L_food is the location of the best solution, and Const3 is a constant equal to 2.

If Temperature < 0.6 (cold), the snake will be in fight mode or mating mode.

Fight mode:

Snake_male_i(iter+1) = Snake_male_i(iter) ± Const3 × FAM × rand × (Snake_female_best − Snake_male_i(iter)) (11)

where Snake_male_i is the ith male location, Snake_female_best is the location of the best solution in the female group, and FAM is the fighting ability of the male solution.

Snake_female_i(iter+1) = Snake_female_i(iter) ± Const3 × FAF × rand × (Snake_male_best − Snake_female_i(iter)) (12)

where Snake_female_i is the ith female location, Snake_male_best is the location of the best solution in the male group, and FAF is the fighting ability of the female solution.

FAM and FAF can be computed from the following equations:

FAM = exp(−Fitness_female_best / Fitness_i) (13)
FAF = exp(−Fitness_male_best / Fitness_i) (14)

where Fitness_female_best is the fitness of the best solution in the female group, Fitness_male_best is the fitness of the best solution in the male group, and Fitness_i is the fitness of the ith solution.

Mating mode.

Snake_male_i(iter+1) = Snake_male_i(iter) ± Const3 × MAm × rand × (Quantity × Snake_female_i(iter) − Snake_male_i(iter)) (15)
Snake_female_i(iter+1) = Snake_female_i(iter) ± Const3 × MAf × rand × (Quantity × Snake_male_i(iter) − Snake_female_i(iter)) (16)

where Snake_female_i is the location of the ith solution in the female group, Snake_male_i is the location of the ith solution in the male group, and MAm and MAf are the mating abilities of males and females, respectively, computed as follows:

MAm = exp(−Fitness_female_i / Fitness_male_i) (17)
MAf = exp(−Fitness_male_i / Fitness_female_i) (18)

If the egg hatches, the worst male solution and the worst female solution are selected and replaced:

Snake_male_worst = Snake_min + rand × (Snake_max − Snake_min) (19)
Snake_female_worst = Snake_min + rand × (Snake_max − Snake_min) (20)

where Snake_male_worst is the worst solution in the male group and Snake_female_worst is the worst solution in the female group. The diversity operator ± gives each update a chance to increase or decrease a solution’s location, so solutions can move through the search space in all possible directions.
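Putting the thresholds together, the per-iteration mode choice described in this section reduces to a small dispatcher (the fight-versus-mate decision is left abstract, since the text does not specify its probability):

```python
def choose_mode(quantity, temp, q_threshold=0.25, t_threshold=0.6):
    """Select the SO update rule for the current iteration:
    explore when food is scarce, chase the food when it is hot,
    otherwise enter fight or mating mode."""
    if quantity < q_threshold:
        return "explore"           # Eqs. (6)-(9)
    if temp > t_threshold:
        return "move_to_food"      # Eq. (10)
    return "fight_or_mate"         # Eqs. (11)-(18)
```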

4. Proposed binary snake optimizer (BSO)

The FS problem deals with binary solutions that move in a discrete search space. The goal of the FS problem is to find the optimal subset of features: the minimum number of features that yields the maximum classification performance. In this study, SO is converted for the first time into a binary form to tackle the FS problem. The original version of SO was developed to deal with a continuous search space. Generating a binary version of SO requires representing a solution as a binary vector whose elements are restricted to either ‘0’ or ‘1’. Concerning the update strategy in the algorithm, the solutions change their positions in the feature space; this requires transfer functions to guarantee that the solution’s elements remain either ‘0’ or ‘1’.

4.1. S-shaped transfer function

The transfer function used in this study is the sigmoid function of Eq. (21). The main task of the sigmoid function is to generate a probability for each element of a solution. If this probability is greater than the random threshold, the value is ‘0’; otherwise, the value is ‘1’, as presented in Eq. (22), where X_i^d(t) is the ith snake at iteration t in dimension d. Algorithm 1 presents the pseudo-code of the binary version of the SO algorithm (BSO), and Fig. 2 shows the flowchart of the BSO.

S(X_i^d(t)) = 1 / (1 + e^{-X_i^d(t)})  (21)
X_i^d(t+1) = { 0 if rand < S(X_i^d(t));  1 if rand >= S(X_i^d(t)) }  (22)
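Eqs. (21) and (22) can be sketched as a small helper; this is a minimal illustration, not the authors' implementation:

```python
import math
import random

def sigmoid(x):
    # Eq. (21): S-shaped transfer function mapping a continuous
    # position component to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def binarize(position, rng=random.Random(42)):
    # Eq. (22): the element becomes 0 when rand < S(x), 1 otherwise.
    return [0 if rng.random() < sigmoid(x) else 1 for x in position]
```

Applying `binarize` after each continuous position update keeps every solution element in {0, 1} as required by the FS search space.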

Fig. 2. The flowchart of the proposed BSO for feature selection.

In the FS approach presented in this work, the transfer function (TF) displayed in Fig. 1, which implements Eq. (21), represents the probability of changing the positions of the elements.


Fig. 1. The sigmoidal transfer function for converting continuous data to discrete.

4.2. BSO for feature selection

To prepare the BSO for the FS problem, two main aspects should be considered: the solution representation and the fitness function. The FS problem requires initializing a solution using a binary vector. The length of this vector is the dimensionality of the problem. Hence, each bit of this vector represents a feature in a dataset. The values of the elements are either ‘0’ or ‘1’. ‘0’ means that the corresponding feature is not selected, while ‘1’ means the corresponding feature is selected. Fig. 3 shows the binary representation of the solutions.
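The representation described above can be illustrated with a tiny helper that maps a binary solution vector to the indices of the selected features (an illustrative sketch only):

```python
def selected_features(mask):
    """Map a binary solution vector to the indices of selected features
    ('1' = the feature is selected, '0' = it is not)."""
    return [i for i, bit in enumerate(mask) if bit == 1]

# Example: a 6-feature dataset where features 0, 2 and 5 are kept.
assert selected_features([1, 0, 1, 0, 0, 1]) == [0, 2, 5]
```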

Fig. 3. The binary representation of the solutions for the feature selection task.

Evaluating all feature subsets of a dataset, as brute-force algorithms do, makes the search algorithm's running time exponential: 2^n subsets must be examined, where n is the number of features. Hence, reducing the number of features increases the efficiency of the search algorithm. For this reason, an FS algorithm is multiobjective: its main target is to find a solution with the minimum number of features and the maximum classification performance. In this study, the K-nearest neighbor (K-NN) algorithm was used as the classifier, with the parameter k set to 5 [40]. The second aspect of the FS problem is the fitness function. Solutions are evaluated based on the number of selected features and the classification error rate, as in Eq. (23), where γ_R(D) is the classification error rate, |SF| is the size of the selected feature subset, |AF| is the total number of features in the dataset, and α and β are weights balancing the two objectives.

Fitness = α · γ_R(D) + β · |SF| / |AF|  (23)
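Eq. (23) can be sketched as follows; the section does not state the exact weight values, so `alpha = 0.99` with `beta = 1 - alpha` (a common convention in wrapper FS) is an assumption:

```python
def fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Eq. (23): weighted sum of the classification error rate and the
    feature-reduction ratio. alpha = 0.99 (beta = 1 - alpha) is an
    assumed setting, not taken from the paper."""
    beta = 1.0 - alpha
    return alpha * error_rate + beta * (n_selected / n_total)
```

Lower fitness is better: the first term rewards accuracy, the second rewards smaller feature subsets.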

4.3. Evolutionary crossover operators

The crossover operator is one of the primary evolutionary operators widely used to enhance swarm-based algorithms. Integrating it into the structure of a swarm-based algorithm yields greater exploration of the search space: solutions are re-positioned and distributed to undiscovered regions. Strengthening the diversity of the algorithm helps the optimizer alleviate the local-minima problem and move closer to the global best solution. The resulting algorithm is abbreviated BSO-CV. Eq. (24) shows the crossover function: each solution S_i in the search space is combined with the position of one of the fittest solutions in the swarm, and the roulette-wheel selection operator is used to pick that fittest solution S_w.

S_i(iter + 1) = S_i(iter) ⊗ S_w(iter)  (24)

Three types of crossover operators are integrated with the SO algorithm; the roulette-wheel selection operator chooses randomly among them in each run of the SO.

  • One-point crossover: randomly selects a single point in the current solution; the elements after that point are exchanged between the two solutions. The crossover is applied to the current solution S_i and the best solution S_w. It occurs when the random number r ∈ [0, 0.33].

  • Two-point crossover: randomly selects two points in the current solution; the elements between the two points are interchanged between S_i and S_w. It occurs when r ∈ (0.33, 0.67].

  • Uniform crossover: the elements of the current solution S_i and the best solution S_w are shuffled according to a pre-determined ratio. For example, a ratio of 30% means that 30% of the solution's elements are exchanged between the two solutions. It occurs when r ∈ (0.67, 1].
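The three operators and the switch probability described above can be sketched as follows; this is a minimal illustration on list-encoded binary solutions, not the authors' code (the 30% uniform-exchange ratio follows the example in the text):

```python
import random

def one_point(a, b, rng):
    p = rng.randrange(1, len(a))            # single cut point
    return a[:p] + b[p:]                    # exchange the tail segment

def two_point(a, b, rng):
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:]           # swap the middle segment

def uniform(a, b, rng, ratio=0.3):
    # exchange roughly `ratio` of the elements with the fittest solution
    return [bi if rng.random() < ratio else ai for ai, bi in zip(a, b)]

def crossover(current, fittest, rng=random.Random(1)):
    r = rng.random()                        # switch probability
    if r <= 0.33:
        return one_point(current, fittest, rng)
    elif r <= 0.67:
        return two_point(current, fittest, rng)
    return uniform(current, fittest, rng)
```

Each call draws r once and dispatches to one operator, mirroring the probability ranges given in the bullet list.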

Fig. 4 illustrates the techniques followed by the different crossover operators. Algorithm 3 shows the pseudocode of the greedy crossover operator.


Fig. 4. Evolutionary crossover operators: (i) one-point crossover; (ii) two-point crossover; (iii) uniform crossover.

4.4. Complexity analysis of BSO and BSO-CV

The time complexity of the BSO and BSO-CV algorithms was analyzed using Big-O notation (i.e., the worst case). In particular, the complexity analysis of these methods for feature selection tasks depends primarily on the initialization process, the dataset dimension (d), the cost of the fitness function (c), the number of iterations of the optimization algorithm (K), the population size n (i.e., the sum of the male and female populations, n_m + n_f), and the number of running experiments (V). In addition, the S-shaped transfer function is used to produce the binary versions of the BSO and BSO-CV. Based on these notations, the general computational complexity of the BSO and BSO-CV can be formulated in Big-O terms as follows:

O(BSO) = O(init.) + O(K × pop. update) + O(K × fitness eval.) + O(K × selection)  (25)

By calculating the Big-O cost of each phase in Eq. (25), the time complexities of BSO and BSO-CV can be represented as follows:

O(BSO) = O((n_f + n_m)d) + O(VK(n_f + n_m)d) + O(VK(n_f + n_m)c) + O(VK(n_f + n_m)d)  (26)
O(BSO-CV) = O((n_f + n_m)d) + O(VK(n_f + n_m)d) + O(2VK(n_f + n_m)c + VK(n_f + n_m)d) + O(VK(n_f + n_m)d)  (27)

As shown in Eq. (26), the complexity depends mainly on the number of iterations and the size of the population. Since (n_f + n_m)d ≪ VK(n_f + n_m)d and (n_f + n_m)d ≪ VKc(n_f + n_m), the term (n_f + n_m)d can be ruled out of Eq. (26). Thus, the time complexity of the BSO can be expressed as follows:

O(BSO) ≈ O(VK(n_f + n_m)d + VK(n_f + n_m)c)  (28)

For the BSO-CV, the time complexity is the same as that of the BSO except that the crossover (CV) operator is added in each iteration; its time complexity is given in Eq. (27).

5. Experimental results and discussion

5.1. Experiment settings and parameters setup

Some preliminary experiments were carried out to determine the input parameters that enabled the proposed method to produce better output. To ensure fairness, the algorithm configurations were identical throughout the experiments. The classifier used within the BSO wrapper framework is K-nearest neighbor (KNN). KNN takes each unclassified data instance in the feature space as input and uses a similarity measure to assign it to a particular category. This labeling method is a form of supervised learning commonly used in disease diagnosis. In this study, K = 5 is used for voting, and the class-membership decision is based on the majority of votes. The parameter settings are: 30 runs, 100 iterations, and a population size of 100.
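A minimal majority-vote K-NN with k = 5, as described above, might look like this; it is a stand-in sketch using Euclidean distance, not the authors' implementation:

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=5):
    """Majority-vote K-NN (k = 5, as in the paper's setup).
    train_X: list of feature vectors; train_y: their class labels;
    x: the unclassified instance to label."""
    dists = sorted(
        (math.dist(p, x), label) for p, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

In the wrapper framework, `train_X` would contain only the columns selected by the current binary solution, so the classifier's error feeds directly into the fitness function.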

5.2. Evaluation measures

The proposed BSO and BSO-CV are evaluated using accuracy, the number of selected features, running time, sensitivity and specificity, convergence curves, boxplots, and the T-test. Descriptions of the accuracy, sensitivity, and specificity measures follow, along with their formulas and meaning for disease diagnosis. Eqs. (29), (30), and (31) show the mathematical formulas of classification accuracy, sensitivity, and specificity, respectively.

Accuracy = (TP + TN) / (TP + TN + FP + FN)  (29)

where:

  • True positives (TPs): instances that are actually sick (have the disease) and that the model diagnoses as sick.

  • True negatives (TNs): instances that are actually well (do not have the disease) and that the model diagnoses as well.

  • False positives (FPs): instances that are actually well but that the model diagnoses as sick.

  • False negatives (FNs): instances that are actually sick but that the model diagnoses as well.

Sensitivity = TP / (TP + FN)  (30)
Specificity = TN / (TN + FP)  (31)
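Eqs. (29)–(31) can be computed directly from the confusion-matrix counts:

```python
def diagnosis_metrics(tp, tn, fp, fn):
    """Eqs. (29)-(31): accuracy, sensitivity (true-positive rate) and
    specificity (true-negative rate) from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

For example, a screening model with 40 TPs, 50 TNs, 5 FPs, and 5 FNs has an accuracy of 0.9, while its sensitivity and specificity reveal how the errors split between missed patients and false alarms.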

5.3. Description of the benchmark datasets

Table 1 shows the datasets used in this study: 23 medical benchmark datasets plus a real COVID-19 dataset. Of the benchmark datasets, twelve were downloaded from the UCI repository (Diagnostic, Original, Prognostic, Coimbra, BreastEW, Retinopathy, Dermatology, ILPD-Liver, Lymphography, Parkinsons, ParkinsonC, and Prostate). Seven datasets were downloaded from KEEL (SPECT, Cleveland, HeartEW, Hepatitis, SAHeart, Spectfheart, and Thyroid0387). Two datasets (Heart and Pima-diabetes) were downloaded from Kaggle. The remaining three datasets (Leukemia, Colon, and Prostate_GE) were downloaded from the scikit-feature repository (https://jundongl.github.io/scikit-feature/datasets.html).

Table 1.

Medical benchmark datasets.

Number Dataset Number of features Number of instances Number of classes
1 Diagnostic 30 569 2
2 Original 9 699 2
3 Prognostic 33 194 2
4 Coimbra 9 115 2
5 BreastEW 30 596 2
6 Retinopathy 19 1151 2
7 Dermatology 34 366 6
8 ILPD-Liver 10 583 2
9 Lymphography 18 148 4
10 Parkinsons 22 194 2
11 ParkinsonC 753 755 2
12 SPECT 22 267 2
13 Cleveland 13 297 5
14 HeartEW 13 270 2
15 Hepatitis 18 79 2
16 SAHeart 9 461 2
17 Spectfheart 43 266 2
18 Thyroid0387 21 7200 3
19 Heart 13 302 5
20 Pima-diabetes 9 768 2
21 Leukemia 7129 72 2
22 Colon 2000 62 2
23 Prostate_GE 5966 102 2

5.4. A real world COVID-19 dataset

Recently, the world has suffered from the spread of the coronavirus disease, caused by a contagious virus. The disease spread so widely that it was classified as a pandemic. It caused many deaths, and the number of patients exceeded the capacity of hospitals to accommodate them. Machine learning techniques have been used to treat the disease and control its spread [41], [42]. In this study, the real COVID-19 dataset was downloaded from https://github.com/AtharvaPeshkar/Covid-19-Patient-Health-Analytics. The purpose is to validate the BSO and BSO-CV by examining their ability to detect the disease. Table 2 shows the features of the dataset. This study intends to predict the death and recovery conditions from the given factors. Patients' records with missing values for both the “death” and “recov” status were removed from the main dataset. For the training and testing methodology, the dataset was split evenly: 50% training and 50% testing.
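The preprocessing described above can be sketched as follows; the field names `death` and `recov` follow the text, and the random shuffle before the 50/50 split is an assumption about how the split was performed:

```python
import random

def prepare_split(records, rng=random.Random(0)):
    """Drop patients whose 'death' and 'recov' status are both missing,
    then split the remainder evenly into train and test halves.
    The shuffle-based split is an assumption, not the authors' code."""
    usable = [r for r in records
              if r.get("death") is not None or r.get("recov") is not None]
    rng.shuffle(usable)
    half = len(usable) // 2
    return usable[:half], usable[half:]
```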

Table 2.

Covid-19 real dataset.

No Feature name Description
1 Id Patient identifier
2 Location Patient location (local address)
3 Country Country of origin of the patient
4 Gender Gender of the patient
5 Age Age of the patient
6 Sym_on Date the patient shows symptoms
7 Hosp_vis Date the patient visits hospital
8 vis_wuhan The patient has visited Wuhan
9 From_wuhan The patient is from Wuhan
10 Symptom1 A symptom presented by the patient
11 Symptom2 A symptom presented by the patient
12 symptom3 A symptom presented by the patient
13 Symptom4 A symptom presented by the patient
14 Symptom5 A symptom presented by the patient
15 Symptom6 A symptom presented by the patient

5.5. Results and discussion

As already stated, FS approaches try to reduce the dimension of the problem space by choosing the most informative features that perform at the most significant level. The binary version of the SO, known as BSO, is employed on the datasets used for this purpose. Table 3 depicts the classification accuracy, sensitivity, specificity, and precision, as well as the number of selected features and the best average fitness value (AVE). Additionally, the standard deviation (STD) of each measurement was recorded. The BSO demonstrates notable improvements by achieving strong results with fewer features: it reaches a classification accuracy greater than 90% on 11 datasets, and greater than 95% on 8 of them. The suggested technique reduced the data by more than 50% on 22 datasets, and on four datasets (ParkinsonC, Leukemia, Colon, and Prostate_GE) the reduction rate exceeded 90%, which minimized complexity and saved resources. For the COVID-19 dataset, the BSO correctly identified the occurrences at a 95% rate while employing, on average, just 3.5 features, yielding a 76% reduction rate, as displayed in Table 10.

Table 3.

The results of SO without feature selection.

Benchmark | Accuracy AVE (STD) | Sensitivity AVE (STD) | Specificity AVE (STD) | Time AVE (STD)
Diagnostic 0.7499 (0.0425) 0.4796 (0.0991) 0.9098 (0.0356) 28.9987 (3.9533)
Original 0.9691 (0.0087) 0.9699 (0.0133) 0.9698 (0.0256) 54.5091 (2.5939)
Prognostic 0.7184 (0.0587) 0.9193 (0.0754) 0.6956 (0.0889) 75.9876 (4.9834)
Coimbra 0.5145 (0.0976) 0.4239 (0.1815) 0.6137 (0.1573) 29.8974 (39.9873)
Retinopathy 0.6523 (0.0303) 0.6742 (0.0517) 0.6333 (0.0407) 149.9875 (29.7789)
Dermatology 0.6934 (0.0423) 0.8332 (0.0459) 0.9211 (0.0772) 30.9987 (2.3456)
ILPD-Liver 0.7728 (0.0444) 0.8060 (0.9281) 0.2066 (0.0735) 55.1002 (2.7344)
Lymphography 0.9043 (0.0739) 0.4162 (0.5348) 0.7000 (0.0948) 59.9567 (49.4922)
Parkinsons 0.8471 (0.0492) 0.5703 (0.1452) 0.9410 (0.0546) 72.5678 (0.9988)
ParkinsonC 0.7307 (0.0295) 0.2707 (0.0595) 0.8938 (0.0313) 77.6543 (5.9866)
SPECT 0.6717 (0.0448) 0.7249 (0.0625) 0.5971 (0.0847) 96.5789 (25.5431)
Cleveland 0.4712 (0.0569) 0.1969 (0.0291) 0.8142 (0.0128) 15.8569 (5.2949)
HeartEW 0.8284 (0.0412) 0.8481 (0.0635) 0.8064 (0.0682) 56.9973 (34.7790)
Hepatitis 0.8778 (0.0759) 0.0789 (0.2138) 0.9533 (0.0491) 45.7563 (2.7331)
SAHeart 0.6239 (0.0516) 0.7774 (0.0477) 0.3216 (0.1003) 97.4367 (5.7791)
Spectfheart 0.7692 (0.0505) 0.3404 (0.1621) 0.8869 (0.0438) 104.8872 (3.4456)
Thyroid0387 0.9382 (0.0065) 0.5463 (0.0329) 0.7536 (0.0129) 155.9893 (7.8893)
Heart 0.8567 (0.5679) 0.8087 (0.0994) 0.6432 (0.3451) 29.6388 (2.4167)
Pima-diabetes 0.7107 (0.0287) 0.8021 (0.0347) 0.5345 (0.0554) 69.8896 (5.7689)
Leukemia 0.8714 (0.0868) 0.9773 (0.0495) 0.6872 (0.2041) 67.9987 (5.7654)
Prostate_GE 0.8711 (0.0535) 0.8813 (0.0941) 0.8594 (0.0763) 250.6578 (9.6678)
BreastEW 0.9596 (0.0154) 0.9824 (0.0158) 0.9212 (0.0404) 7.8890 (0.6789)
Colon 0.7528 (0.1285) 0.8701 (0.1508) 0.6002 (0.2365) 9.9865 (0.8976)

Table 10.

Results of BSO and BSO-CV on COVID-19.

Results of BSO on the COVID-19 dataset

Evaluation measure Average Standard deviation Minimum Maximum
Accuracy 0.9378 0.0136 0.9167 0.9500
# Selected features 3.1000 2.5582 2.5000 3.5000
Fitness value 0.1371 0.0054 0.1304 0.1390
Running time 21.6646 0.0412 21.6157 21.7439
Sensitivity 0.2706 0.0610 0.1765 0.3571
Specificity 0.9631 0.0197 0.9314 1.0000

Results of BSO-CV on the COVID-19 dataset

Evaluation measure Average Standard deviation Minimum Maximum
Accuracy 0.9560 0.0123 0.9167 0.9661
#Selected features 1.7000 2.4967 1.5000 2.4400
Fitness value 0.1351 0.0039 0.1299 0.1366
Running time 4.4075 0.1031 4.2226 4.6160
Sensitivity 0.2973 0.1132 0.1111 0.5000
Specificity 0.9664 0.0231 0.9118 0.9905

These results stem from how snakes behave, as conditions such as food quantity and temperature lead snakes to change their locations. Additionally, the male and female groups can investigate more positions in the search space during the initialization phase.

In general, population-based evolutionary algorithms can fall into local optima because solutions are investigated in a guided manner around the best solution's space. As a population-based evolutionary algorithm, SO can likewise become trapped in a local-optimum region. Therefore, we use the well-known crossover (CV) technique to avoid trapping in local optima and to balance the exploration and exploitation phases. Furthermore, by using the CV technique and the position-update operator, more positions in the search space are discovered, allowing the search to move beyond the local-optimum region. Table 4 shows a classification accuracy comparison between the BSO and the augmented BSO with a crossover operator, called BSO-CV. It demonstrates that the BSO-CV outperforms the standard BSO on 16 datasets and matches its classification accuracy on 6 datasets. Furthermore, on six datasets where the average accuracies are similar (Original, Coimbra, Dermatology, Pima-diabetes, Leukemia, and Colon), the minimum accuracy of the BSO-CV is better than that of the BSO, which indicates the effect of covering more positions in the search space. Thus, the new solutions generated by the CV operators play an essential role in widening the search ability of the algorithm and strengthening its power to avoid local optima.

Table 4.

Results of the proposed BSO vs augmented BSO with crossover in terms of average, standard deviation, minimum, maximum accuracy.

Benchmark | Average (BSO, BSO-CV) | Standard deviation (BSO, BSO-CV) | Minimum (BSO, BSO-CV) | Maximum (BSO, BSO-CV)
Diagnostic 0.9912 0.9930 0.0091 0.0124 0.9549 0.9725 0.9800 1.0000
Original 0.9886 0.9886 0.0060 0.0060 0.9357 0.9557 0.9700 1.0000
Prognostic 0.8368 0.8474 0.0166 0.0299 0.7995 0.8421 0.8733 0.8947
Coimbra 0.8182 0.8182 0.0000 0.0000 0.7582 0.7782 0.7910 0.8182
Retinopathy 0.7348 0.7391 0.0074 0.0100 0.6917 0.7004 0.7210 0.7478
Dermatology 1.0000 1.0000 0.0000 0.0000 0.9290 0.9400 0.9600 1.0000
ILPD-Liver 0.7949 0.7966 0.0000 0.0054 0.7397 0.7466 0.7711 0.7966
Lymphography 0.9452 0.9729 0.0628 0.0351 0.8571 0.9286 0.95550 1.0000
Parkinsons 0.9845 0.9850 0.0250 0.0242 0.9474 0.9500 0.9676 1.0000
ParkinsonC 0.7833 0.7873 0.0158 0.0131 0.7332 0.7632 0.7900 0.8133
SPECT 0.8047 0.8180 0.0479 0.0266 0.7989 0.8889 0.8419 0.8889
Cleveland 0.6817 0.7041 0.0390 0.0145 0.6097 0.7737 0.7022 0.7241
HeartEW 0.9148 0.9370 0.0250 0.0250 0.6800 0.7078 0.9411 0.9630
Hepatitis 0.9750 0.9875 0.0530 0.0395 0.8362 0.9350 0.98100 1.0000
SAHeart 0.7478 0.7500 0.0112 0.0185 0.7391 0.8730 0.7509 0.7826
Spectfheart 0.9051 0.9084 0.0406 0.0267 0.7091 0.8846 0.9445 0.9615
Thyroid0387 0.9894 0.9881 0.0032 0.0032 0.9145 0.9400 0.9831 0.9944
Heart 0.9074 0.9259 0.0000 0.0195 0.7082 0.7682 0.9033 0.9259
Pima-diabetes 0.8182 0.8182 0.0000 0.0000 0.8344 0.8959 0.7982 0.8182
Leukemia 1.0000 1.0000 0.0000 0.0000 0.9066 0.9321 0.9870 1.0000
Prostate_GE 0.9900 1.0000 0.0316 0.0000 0.8571 0.8571 0.9822 1.0000
BreastEW 0.9895 0.9912 0.0092 0.0123 0.8093 0.9080 0.9810 1.0000
Colon 0.9571 0.9571 0.0690 0.0690 0.9000 0.9049 0.9744 1.0000

On the other hand, based on the number of selected features, Table 5 shows that the BSO-CV achieves a higher reduction rate by selecting fewer features on 14 datasets compared with the BSO. Moreover, the BSO-CV surpasses the BSO in reduction rate on the COVID-19 dataset, shrinking the dataset's dimension by 89% as opposed to the BSO's 79%, as shown in Table 10. As a result, BSO and BSO-CV show a highly significant reduction rate for solving FS problems.

Table 5.

Results of the proposed BSO vs augmented BSO with crossover in terms of the average, standard deviation, minimum and maximum number of selected features.

Benchmark | Average (BSO, BSO-CV) | Standard deviation (BSO, BSO-CV) | Minimum (BSO, BSO-CV) | Maximum (BSO, BSO-CV)
Diagnostic 14.4000 13.0000 2.6331 1.4907 9.0000 11.0000 18.0000 15.0000
Original 3.3000 3.2000 0.4830 0.4216 3.0000 3.0000 4.0000 4.0000
Prognostic 15.5000 14.2000 2.6352 2.2509 12.0000 10.0000 21.0000 18.0000
Coimbra 6.1000 6.0000 0.3162 0.0000 6.0000 6.0000 7.0000 6.0000
Retinopathy 11.2000 10.5000 2.2010 1.9003 8.0000 8.0000 14.0000 14.0000
Dermatology 17.3000 17.3000 1.7670 1.7670 14.0000 15.0000 19.0000 19.0000
ILPD-Liver 4.0000 3.6000 0.4714 0.5164 3.0000 3.0000 5.0000 4.0000
Lymphography 9.4000 9.1000 3.3015 1.9120 16.0000 7.0000 27.0000 12.0000
Parkinsons 9.1000 8.5000 1.9692 1.3540 7.0000 4.0000 13.0000 10.0000
ParkinsonC 461.4000 466.1000 23.5523 20.1299 425.0000 437.0000 496.0000 508.0000
SPECT 12.5000 12.6000 1.5811 1.8379 9.0000 10.0000 15.0000 15.0000
Cleveland 6.4000 6.5000 0.9661 0.9718 5.0000 5.0000 8.0000 8.0000
HeartEW 7.3000 6.8000 1.4181 0.6325 5.0000 6.0000 9.0000 8.0000
Hepatitis 5.9000 6.9000 1.8529 1.9120 2.0000 4.0000 8.0000 9.0000
SAHeart 3.1000 2.6000 0.5676 0.5164 2.0000 2.0000 4.0000 3.0000
Spectfheart 22.7000 25.0000 3.3015 3.0551 16.0000 19.0000 27.0000 30.0000
Thyroid0387 10.0000 9.5000 0.9428 1.8409 9.0000 6.0000 12.0000 12.0000
Heart 5.2000 4.2000 0.9189 1.3166 4.0000 3.0000 6.0000 6.0000
Pima-diabetes 4.0000 4.0000 0.0000 0.0000 4.0000 4.0000 4.0000 4.0000
Leukemia 3502.8000 3498.1000 32.8830 22.0880 3453.0000 3468.0000 3541.0000 3533.0000
Prostate_GE 2969.8000 2982.5000 28.9513 48.0954 2934.0000 2898.0000 3028.0000 3055.0000
BreastEW 15.8000 16.0000 1.6193 2.7889 13.0000 12.0000 19.0000 20.0000
Colon 965.3000 970.3000 15.8749 12.6232 937.0000 945.0000 986.0000 994.0000

Furthermore, since the BSO-CV selects fewer features while maintaining competitive accuracy compared with the BSO, it achieves the lowest fitness values during the algorithm's iterations; the reduction rate and the classification error rate are the two objectives of the fitness function, as shown in Eq. (23). Table 5 shows the best number of selected features, while the best fitness values for both BSO and BSO-CV are illustrated in Table 6.

Table 6.

Results of the proposed BSO vs augmented BSO with crossover in terms of the average, standard deviation, minimum, and maximum fitness values.

Benchmark | Average (BSO, BSO-CV) | Standard deviation (BSO, BSO-CV) | Minimum (BSO, BSO-CV) | Maximum (BSO, BSO-CV)
Diagnostic 0.0230 0.0217 0.0033 0.0005 0.0204 0.0211 0.0321 0.0224
Original 0.0122 0.0114 0.0029 0.0022 0.0105 0.0105 0.0176 0.0176
Prognostic 0.1869 0.1813 0.0117 0.0112 0.1625 0.1593 0.2122 0.1877
Coimbra 0.0929 0.0928 0.0004 0.0000 0.0928 0.0928 0.0939 0.0928
Retinopathy 0.2353 0.2337 0.0015 0.0010 0.2329 0.2323 0.2382 0.2355
Dermatology 0.0051 0.0051 0.0005 0.0004 0.0041 0.0044 0.0056 0.0056
ILPD-Liver 0.2259 0.2255 0.0005 0.0005 0.2249 0.2249 0.2269 0.2259
Lymphography 0.0598 0.0528 0.0175 0.0235 0.0386 0.0067 0.0749 0.0738
Parkinsons 0.0598 0.0851 0.0168 0.0106 0.0549 0.0789 0.1056 0.1056
ParkinsonC 0.2297 0.2245 0.0094 0.0030 0.2220 0.2224 0.2483 0.2291
SPECT 0.1738 0.1608 0.0179 0.0087 0.1348 0.1540 0.1927 0.1736
Cleveland 0.1738 0.3390 0.0089 0.0010 0.3346 0.3172 0.3578 0.3520
HeartEW 0.1670 0.1611 0.0079 0.0098 0.1521 0.1513 0.1719 0.1712
Hepatitis 0.0218 0.0100 0.0292 0.0190 0.0028 0.0022 0.0652 0.0641
SAHeart 0.2940 0.2934 0.0006 0.0006 0.2928 0.2928 0.2950 0.2939
Spectfheart 0.0948 0.0841 0.0167 0.0117 0.0613 0.0622 0.1164 0.0995
Thyroid0387 0.0208 0.0192 0.0025 0.0026 0.0169 0.0152 0.0254 0.0222
Heart 0.1840 0.1813 0.0114 0.0078 0.1700 0.1700 0.2067 0.1867
Pima-diabetes 0.1862 0.1862 0.0000 0.0000 0.1862 0.1862 0.1862 0.1862
Leukemia 0.1463 0.1463 0.0000 0.0000 0.1463 0.1463 0.1464 0.1464
Prostate_GE 0.0545 0.0545 0.0000 0.0000 0.0544 0.0544 0.0546 0.0546
BreastEW 0.0079 0.0062 0.0038 0.0027 0.0053 0.0040 0.0138 0.0134
Colon 0.2524 0.2523 0.0001 0.0001 0.2522 0.2522 0.2525 0.2524

Evolutionary algorithms are iterative by nature: they begin with random solutions and iteratively update them to generate new ones. Then, based on the fitness function, the fitness of the current best solution is compared with that of newly generated solutions, and the best is kept. For the proposed BSO and BSO-CV, Table 7 presents the running times of both algorithms. The BSO-CV clearly takes less time than the BSO on 17 datasets. Furthermore, the BSO-CV saves computational resources and time by reaching the best fitness value in early iterations, since it uses a crossover operator and covers more positions in the search space. In other words, it balances the exploration and exploitation phases, which gives it the power to avoid falling into local optima.

Table 7.

Results of the proposed BSO vs augmented BSO with crossover in terms of the average, standard deviation, minimum, and maximum running times.

Benchmark | Average (BSO, BSO-CV) | Standard deviation (BSO, BSO-CV) | Minimum (BSO, BSO-CV) | Maximum (BSO, BSO-CV)
Diagnostic 24.0234 20.1709 1.2632 0.0919 22.3863 20.0304 25.7645 20.2939
Original 52.4918 21.7947 1.4929 0.6899 50.6546 19.8469 55.1332 22.1207
Prognostic 68.5666 19.3182 1.7727 2.1484 65.8117 18.1769 71.0640 24.6123
Coimbra 21.6436 31.3520 4.3715 1.5690 14.0520 30.1216 27.5464 35.1497
Retinopathy 143.4651 25.5549 8.6208 1.1699 122.1240 24.5064 148.8203 28.7080
Dermatology 19.8110 19.6653 1.2157 0.0583 18.0606 19.5964 21.5541 19.7621
ILPD-Liver 40.2314 21.2359 1.6953 0.14384 38.0070 21.0012 43.2428 21.5077
Lymphography 48.6365 18.7374 47.3621 18.5243 50.1288 19.0404 2.4862 0.1618
Parkinsons 61.0268 18.4078 0.9944 0.1112 59.6189 18.2514 62.7003 18.5701
ParkinsonC 71.5125 85.9408 2.7348 0.7091 67.6873 85.0221 74.2297 87.0784
SPECT 88.4608 19.4803 1.5437 0.2716 86.3784 19.0815 91.1695 19.8700
Cleveland 13.7458 34.7516 4.1938 6.1588 7.3716 21.4041 18.0061 41.4505
HeartEW 31.8354 19.6732 1.3055 0.0925 29.7337 19.5130 33.8312 19.8095
Hepatitis 35.6952 17.9791 1.5000 0.1082 33.8597 17.8650 38.6481 18.1687
SAHeart 86.1272 20.5537 1.6604 0.0840 83.2163 20.4461 88.0913 20.7296
Spectfheart 92.3180 19.3628 1.2729 0.1233 90.6496 19.2185 94.3931 19.5442
Thyroid0387 128.0856 169.6468 4.6228 7.6117 118.4025 158.0242 131.3219 179.4991
Heart 28.5277 19.3467 1.3055 0.4925 26.3888 17.9694 30.4858 19.7040
Pima-diabetes 64.7067 22.0212 1.7239 0.8515 65.8117 19.7021 66.9316 22.7713
Leukemia 56.6581 45.1058 2.4862 0.3344 53.3672 44.2620 59.1867 45.3924
Prostate_GE 149.3277 81.1939 7.0456 4.4155 135.0467 70.0391 155.6277 86.0762
BreastEW 4.5585 21.0010 0.0729 0.1930 4.5031 20.7807 4.7405 21.3079
Colon 5.8711 23.8118 0.5273 0.1527 5.5947 23.6718 7.0661 24.1146

Since the BSO-CV achieved the best fitness values in early iterations, the proposed algorithm has a fast convergence speed. Fig. 6 illustrates the convergence curves of the BSO and BSO-CV algorithms. The curves confirm the benefit of covering more positions in the search space and the effect of balancing the exploration and exploitation phases. As shown in the SAHeart, Spectfheart, SPECT, Heart, and COVID-19 subfigures, the BSO becomes trapped in local optima, whereas the BSO-CV escapes them and does not get stuck in a local-optimum solution.

Fig. 6. Convergence curves for BSO-CV and BSO methods on the medical benchmark datasets and the COVID-19 dataset.

Additionally, Fig. 5 shows box plots of classification accuracy for the BSO and BSO-CV methods on the tested datasets. The y-axis represents classification accuracy, while the x-axis represents the tested methods. The distribution of the BSO-CV is clearly better than that of the BSO, since the median of the BSO-CV's box plots is greater than (or, in some cases, equal to) the median of the BSO's, which demonstrates the robustness of the proposed method. Fig. 7 shows the average maximum and minimum accuracy of BSO and BSO-CV for all datasets.

Fig. 5. Boxplots for BSO-CV and BSO methods on the medical benchmark datasets and COVID-19 dataset.

Fig. 7. Average of maximum and minimum accuracy of BSO and BSO-CV for all datasets.

In classification problems, sensitivity and specificity are essential for determining how well a model can identify true positives and true negatives for each category. Furthermore, more diversification in the obtained solutions means covering a wider region of the search space, which makes the algorithm more sensitive and specific. Table 8 and Table 9 show the sensitivity and specificity of the proposed BSO and BSO-CV, respectively. As clearly shown, the BSO-CV is more sensitive and specific than the BSO on 22 datasets and performs better on the COVID-19 dataset. The reason is the crossover technique, which gives the BSO-CV the ability to discover more positions in the search space, balances the exploration and exploitation phases, and enables it to avoid being trapped in local optima.

Table 8.

Comparison results of the proposed BSO vs augmented BSO with crossover in terms of average, standard deviation, minimum and maximum sensitivity.

Benchmark | Average (BSO, BSO-CV) | Standard deviation (BSO, BSO-CV) | Minimum (BSO, BSO-CV) | Maximum (BSO, BSO-CV)
Diagnostic 0.9635 0.9684 0.0113 0.0222 0.9437 0.9221 0.9825 1.0000
Original 0.9704 0.9718 0.0094 0.0118 0.9577 0.9420 0.9867 0.9853
Prognostic 0.1933 0.2103 0.1549 0.2268 0.0000 0.0000 0.4286 0.6667
Coimbra 0.6407 0.7156 0.1371 0.1273 0.5455 0.4444 1.0000 0.8571
Retinopathy 0.6443 0.6726 0.0328 0.6239 0.5882 0.0293 0.7025 0.7154
Dermatology 0.9500 0.9817 0.0304 0.0158 0.9200 0.9500 1.0000 1.0000
ILPD-Liver 0.8170 0.8352 0.0496 0.0629 0.7284 0.6829 0.8875 0.8941
Lymphography 0.4473 0.4861 0.4239 0.4749 0.0000 0.0000 0.9286 1.0000
Parkinsons 0.9494 0.9531 0.0558 0.0286 0.8519 0.9000 1.0000 1.0000
ParkinsonC 0.8835 0.8861 0.0340 0.0369 0.8182 0.8073 0.9252 0.9292
SPECT 0.5112 0.5405 0.1152 0.0747 0.3529 0.4000 0.7059 0.6667
Cleveland 0.1792 0.5405 0.0953 0.0948 0.0000 0.0769 0.3000 0.3636
HeartEW 0.8297 0.8783 0.0581 0.0786 0.7931 0.7097 0.9630 0.9667
Hepatitis 0.2000 0.3283 0.2297 0.3234 0.0000 0.0000 0.5000 1.0000
SAHeart 0.3854 0.4916 0.0531 0.0914 0.3333 0.3333 0.5000 0.6071
Spectfheart 0.8520 0.8661 0.0404 0.0705 0.7857 0.7436 0.9024 0.9474
Thyroid0387 0.8472 0.8345 0.0816 0.0986 0.7097 0.6923 0.9655 1.0000
Heart 0.8244 0.8544 0.0694 0.0847 0.7353 0.7241 0.9355 0.9643
Pima-diabetes 0.5008 0.5362 0.0473 0.0586 0.4483 0.3750 0.6102 0.5800
Leukemia 0.6470 0.7895 0.2734 0.1701 0.3333 0.5000 1.0000 1.0000
Prostate_GE 0.8392 0.8769 0.0882 0.0892 0.7273 0.7273 1.0000 1.0000
BreastEW 0.9411 0.9521 0.0219 0.0159 0.9130 0.9306 0.9857 0.9726
Colon 0.5238 0.5850 0.1903 0.2304 0.3333 0.2500 1.0000 1.0000

Table 9.

Results of the proposed BSO vs augmented BSO with crossover in terms of average, standard deviation, minimum and maximum specificity.

Benchmark | Average (BSO, BSO-CV) | Standard deviation (BSO, BSO-CV) | Minimum (BSO, BSO-CV) | Maximum (BSO, BSO-CV)
Diagnostic 0.9058 0.9133 0.0387 0.0355 0.8571 0.8605 0.9778 0.9722
Original 0.9735 0.9794 0.0149 0.0111 0.9531 0.9577 1.0000 0.9867
Prognostic 0.8909 0.9218 0.0419 0.0527 0.8214 0.8125 0.9688 1.0000
Coimbra 0.7087 0.7636 0.1253 0.1587 0.5714 0.5000 1.0000 1.0000
Retinopathy 0.7088 0.7130 0.0450 0.6577 0.6449 0.0383 0.7822 0.7565
Dermatology 0.9404 0.9415 0.0354 0.0285 0.8824 0.8889 1.0000 0.9800
ILPD-Liver 0.2442 0.2980 0.0624 0.1057 0.1379 0.1667 0.3611 0.4545
Lymphography 0.7311 0.7699 0.0839 0.1392 0.6154 0.5357 0.8571 1.0000
Parkinsons 0.6498 0.6745 0.0756 0.1030 0.5000 0.4545 0.8889 0.8182
ParkinsonC 0.2856 0.2901 0.0149 0.0573 0.1622 0.1842 0.4048 0.3714
SPECT 0.6075 0.7784 0.1131 0.6389 0.3889 0.9000 0.7778 0.0787
Cleveland 0.6215 0.6469 0.0417 0.0647 0.5714 0.5490 0.7021 0.7647
HeartEW 0.7152 0.7208 0.1111 0.0996 0.5714 0.6000 0.8333 0.9130
Hepatitis 0.9552 0.9565 0.0514 0.0504 0.8571 0.8571 1.0000 1.0000
SAHeart 0.7440 0.7901 0.0700 0.0504 0.7049 0.8571 0.9492 1.0000
Spectfheart 0.4165 0.4879 0.1507 0.1328 0.2222 0.2000 0.7000 0.7000
Thyroid0387 0.9828 0.9815 0.0037 0.0039 0.9759 0.9780 0.9866 0.9908
Heart 0.6629 0.6686 0.0890 0.0946 0.5200 0.5357 0.8095 0.8000
Pima-diabetes 0.8007 0.8211 0.0319 0.0390 0.7573 0.7292 0.8632 0.8614
Leukemia 0.9657 0.9678 0.0738 0.0564 0.7778 0.8571 1.0000 1.0000
Prostate_GE 0.8852 0.8887 0.0753 0.0998 0.8000 0.6667 1.0000 1.0000
BreastEW 0.8920 0.9015 0.0697 0.0472 0.7442 0.8298 1.0000 0.9545
Colon 0.8276 0.8663 0.1704 0.1635 0.4444 0.5556 1.0000 1.0000

Finally, the conducted results show superior performance of the proposed BSO and BSO-CV in solving FS problems. The crossover technique has a significant effect on balancing the exploration and exploitation phases, which plays a vital role in allowing the algorithm to converge more quickly and avoid becoming stuck in local optima. This results in a more reliable ML algorithm.

5.6. Comparison with other meta-heuristic algorithms in the literature

The above results show that the BSO-CV achieved promising results in classification accuracy, running time, sensitivity, specificity, convergence, and boxplots. It also achieves competitive results regarding the fitness value and the size of the selected feature subset. To validate these results and show their reliability, the proposed BSO-CV is compared against seven methods from the literature: LBMFO-V3 [12] on the 23 medical datasets; HLBDA [36] on the COVID-19 dataset; CHIO-GC [2] on the 23 medical datasets and the COVID-19 dataset; and finally four filter methods used in previous studies, namely Chi-square, Relief, correlation-based feature selection (CFS), and information gain (IG) [12].

5.6.1. Comparison with CHIO-GC

Table 11 shows a comparison between the proposed BSO-CV and the CHIO-GC in terms of average accuracy and average feature subset size. The BSO-CV outperforms the CHIO-GC on all datasets except three: ParkinsonC, Prognostic, and Coimbra. Regarding feature subset size, the BSO-CV outperforms the CHIO-GC in 57% of the datasets. On ten datasets, the CHIO-GC achieved smaller feature subsets than the BSO-CV: Diagnostic, Prognostic, Coimbra, Retinopathy, ParkinsonC, SPECT, HeartEW, Spectfheart, Thyroid0387, and BreastEW.

Table 11.

Comparison results of the BSO-CV with LBMFO-V3 and CHIO-GC.

Benchmark: Average accuracy (BSO-CV, CHIO-GC, LBMFO-V3), Average selection size (BSO-CV, CHIO-GC, LBMFO-V3)
Diagnostic 0.9930 0.9033 0.9100 14.4000 13.3700 13.9991
Original 0.9886 0.9710 0.9683 3.3000 5.1040 5.5000
Prognostic 0.8474 0.6716 0.9312 15.5000 14.6202 3.5103
Coimbra 0.8182 0.8896 0.9312 6.1000 3.6007 3.5103
Retinopathy 0.7391 0.6436 0.5380 11.2000 7.2647 6.9002
Dermatology 1.0000 0.8006 0.8442 17.3000 18.4900 18.3541
ILPD-Liver 0.7966 0.7716 0.7143 4.0000 4.0000 4.0000
Lymphography 0.9729 0.8343 0.8002 9.4000 10.0622 9.7520
Parkinsons 0.9850 0.7903 0.7689 9.1000 9.7383 10.3584
ParkinsonC 0.7873 0.8400 0.8190 461.4000 365.8322 369.1070
SPECT 0.8180 0.6960 0.6576 12.5000 9.6050 10.7832
Cleveland 0.7041 0.5966 0.5333 6.4000 6.8097 6.6899
HeartEW 0.9370 0.9116 0.9388 7.3000 7.0105 6.3100
Hepatitis 0.9875 0.7903 0.7500 5.9000 8.2011 8.3569
SAHeart 0.7500 0.7036 0.6992 3.1000 3.1551 3.2222
Spectfheart 0.9084 0.7303 0.7013 22.7000 21.0030 20.4598
Thyroid0387 0.9881 0.9603 0.9776 10.0000 8.0116 8.4563
Heart 0.9259 0.8126 0.7603 5.2000 6.1505 6.2752
Pima-diabetes 0.8182 0.7956 0.8065 4.0000 6.8387 6.7612
Leukemia 1.0000 0.9900 1.0000 3502.8000 3560.5107 3570.7137
Prostate_GE 1.0000 0.6010 0.5056 2969.8000 2979.4116 2984.7153
BreastEW 0.9912 0.9400 0.9398 16.0000 13.7303 13.9714
Colon 0.9571 0.7176 0.6667 970.3000 1000.0067 991.5551

In Fig. 8, the BSO-CV is compared with the CHIO-GC in terms of classification accuracy and the number of selected features using the 23 benchmark medical datasets. As shown in the subfigure on the left side, the BSO-CV achieved an average classification accuracy of 89.9% across the 23 datasets, outperforming the CHIO-GC's 78.5%. In terms of feature selection size, the BSO-CV achieved a smaller average of 350.2 features across all datasets, compared with 351.3 features for the CHIO-GC.

Fig. 8.

Fig. 8

Average accuracy and average number of selected features using BSO-CV and CHIO-GC methods on 23 benchmark medical datasets.

As mentioned in [2], the CHIO-GC employs a greedy crossover approach that always takes the best candidate to generate new solutions, thereby discarding the worst ones. However, worst solutions can evolve into better solutions in later generations under different search techniques [43]. In contrast, the BSO-CV employs a roulette wheel mechanism for choosing the crossover operator (single-point, two-point, or uniform), which enhances the diversity of the generated solutions and avoids trapping in local optima. In addition, the SO algorithm initializes the population as two groups, males and females, which makes the initial population more exploratory. For these reasons, the BSO-CV shows better performance than the CHIO-GC.
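The roulette-wheel choice among the three crossover operators can be sketched as follows (a generic illustration of roulette-wheel selection; the operator weights shown are an assumption, since the paper does not state them here):

```python
import random

def roulette_pick(operators, weights):
    """Pick one operator with probability proportional to its weight."""
    total = sum(weights)
    r = random.uniform(0, total)
    acc = 0.0
    for op, w in zip(operators, weights):
        acc += w
        if r <= acc:
            return op
    return operators[-1]  # guard against floating-point edge cases

# Illustrative assumption: the three operators are equally weighted
ops = ["one_point", "two_point", "uniform"]
weights = [1.0, 1.0, 1.0]
```

Unlike a greedy choice, every operator keeps a nonzero chance of being applied, which is what sustains the diversity of the generated solutions.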

Fig. 9 graphically shows the results of the BSO-CV compared with the CHIO-GC on the COVID-19 dataset. The BSO-CV achieved an accuracy of 95.9%, whereas the CHIO-GC achieved a lower accuracy of 93.2%. Regarding the number of selected features, Fig. 9 also reports the average selection sizes of both methods.

Fig. 9.

Fig. 9

Average accuracy and average number of selected features using BSO-CV and CHIO-GC methods on COVID-19 dataset.

5.6.2. Comparison with LBMFO-V3

Table 11 shows that the BSO-CV outperformed the LBMFO-V3 in accuracy on all datasets except Prognostic, Coimbra, ParkinsonC, and HeartEW, i.e., the proposed BSO-CV was superior in 83% of the datasets. Table 11 also shows that the BSO-CV selected smaller feature subsets than the LBMFO-V3 in 57% of the datasets.

Fig. 10 graphically shows the results of this comparison. The subfigure on the left side shows that the BSO-CV achieved an average accuracy of 89.9% across all datasets, whereas the LBMFO-V3 achieved 76.5%; the BSO-CV therefore outperforms the LBMFO-V3 by 13.4 percentage points on average. The subfigure on the right side shows that the BSO-CV selected an average of 350.2 features across all datasets, slightly fewer than the LBMFO-V3's average of 351.7.

Fig. 10.

Fig. 10

Average accuracy and the average number of selected features using BSO-CV and LBMFO-V3 methods on 23 benchmark medical datasets.

As mentioned earlier, the proposed BSO-CV employs different crossover operators, which give the algorithm the power to cover more regions of the search space and avoid falling into local optima regions. In other words, it creates a balance between the exploration and exploitation phases. In addition, the MFO suffers from slow population diversity [44], whereas the BSO-CV mitigates this issue by using groups of males and females in the initialization phase. For these reasons, the BSO-CV shows better performance than the LBMFO-V3.

5.6.3. Comparison with HLBDA

Fig. 11 compares the BSO-CV with the HLBDA in terms of classification accuracy and the number of selected features on the COVID-19 dataset. In the subfigure on the left side, the BSO-CV achieved an accuracy of 95.9%, while the HLBDA achieved an accuracy of 91.5%. On the other hand, in the subfigure on the right side, the HLBDA achieved an average selection size of 1.7 features, whereas the BSO-CV selected a slightly larger average of 2.3 features.

Fig. 11.

Fig. 11

Average accuracy and the average number of selected features using BSO-CV and HLBDA methods on the COVID-19 dataset.

The hyperlearning approach was introduced into the HLBDA algorithm to enhance the search capability of the BDA by considering both individual and group bests, which helps it circumvent the local optima problem. However, its search strategy is still restricted in the number of positions it can reach in the search space. By comparison, the proposed BSO-CV outperforms the HLBDA by using crossover operators to reach more positions in the search space.

5.7. Comparison with filter-based methods

In this subsection, the proposed BSO-CV, as a wrapper-based method, is compared against four common filter-based approaches, namely Chi-square, Relief, CFS, and IG. Table 12 shows the average accuracy achieved by these methods after applying them 30 times to the 23 medical benchmark datasets. It can be observed from Table 12 that the BSO-CV exceeded all the filters on all datasets except where the Chi-square filter is applied to the SPECT and Thyroid0387 datasets. Fig. 12 shows the accuracy of the BSO-CV and the four filter methods; the red line representing the BSO-CV occupies the largest area of the accuracy radar chart. In terms of the number of selected features, Table 13 shows that the BSO-CV was superior in 81% of the datasets.
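Two of these filter baselines can be reproduced with scikit-learn's univariate selectors (a hedged sketch: the dataset and the `k` value are illustrative, mutual information stands in for IG, and Relief and CFS would need third-party packages):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif

# 569 samples, 30 non-negative features (chi2 requires non-negative inputs)
X, y = load_breast_cancer(return_X_y=True)

# Chi-square filter: scores each feature independently of any classifier
chi_sel = SelectKBest(chi2, k=16).fit(X, y)

# Information gain, approximated here by mutual information with the label
ig_sel = SelectKBest(mutual_info_classif, k=16).fit(X, y)

print("chi2 keeps:", chi_sel.get_support().sum(), "features")
print("IG keeps:", ig_sel.get_support().sum(), "features")
```

Because filters score features independently of the classifier, they are fast but cannot account for feature interactions, which is the gap wrapper methods such as the BSO-CV aim to close.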

Table 12.

Comparison between the proposed BSO-CV and filter methods using the classification accuracy.

Benchmark BSO-CV Chi-square Relief CFS IG
Diagnostic 0.9930 0.5714 0.9585 0.9533 0.9349
Original 0.9886 0.9091 0.6426 0.6860 0.6759
Prognostic 0.8447 0.5910 0.7727 0.7576 0.7577
Coimbra 0.8182 0.3846 0.6672 0.5763 0.5578
Retinopathy 0.7391 0.6349 0.5036 0.4783 0.5393
Dermatology 1.0000 0.7250 0.7248 0.4732 0.4021
ILPD-Liver 0.7949 0.7106 0.5119 0.5223 0.5264
Lymphography 0.9729 0.8824 0.5886 0.5533 0.5204
Parkinsons 0.9850 0.7581 0.7588 0.7360 0.7150
ParkinsonC 0.7873 0.6593 0.6590 0.6487 0.6376
SPECT 0.8180 0.9667 0.5651 0.5508 0.5460
Cleveland 0.7041 0.3940 0.1181 0.0398 0.0826
HeartEW 0.9370 0.9334 0.6153 0.5757 0.6202
Hepatitis 0.9875 0.7778 0.5538 0.5857 0.6417
SAHeart 0.7500 0.6471 0.5024 0.5115 0.5227
SPECTfheart 0.9084 0.7000 0.6079 0.6279 0.5551
Thyroid0387 0.9881 1.0000 0.6379 0.6955 0.9773
Heart 0.9259 0.5333 0.6317 0.5575 0.6114
Pima-diabetes 0.8182 0.6905 0.5147 0.5426 0.5264
Leukemia 1.0000 0.7120 0.6883 0.6759 0.6410
Prostate_GE 1.0000 0.5042 0.5033 0.4786 0.4421
BreastEW 0.9912 0.9365 0.8160 0.8029 0.8128
Colon 0.9571 0.5850 0.5641 0.5116 0.5097

Fig. 12.

Fig. 12

Accuracy-based comparison between the BSO-CV and other common filter methods.

Table 13.

Comparison between the proposed BSO-CV and filter methods based on the number of selected features.

Benchmark BSO-CV Chi-square Relief CFS IG
Diagnostic 13.0000 18.0000 17.5000 16.0000 15.0000
Original 3.2000 4.0000 4.0000 4.0000 4.0000
Prognostic 14.2000 20.0000 18.0000 19.0000 21.0000
Coimbra 6.0000 7.0000 7.0000 7.0000 7.0000
Retinopathy 10.5000 14.0000 13.0000 11.0000 12.0000
Dermatology 17.3000 19.0000 15.0000 14.0000 16.0000
ILPD-Liver 3.6000 5.0000 4.0000 4.0000 4.0000
Lymphography 9.1000 16.0000 27.0000 12.0000 15.0000
Parkinsons 8.5000 13.0000 10.0000 9.0000 11.0000
ParkinsonC 466.1000 495.0000 496.0000 460.0000 461.0000
SPECT 12.6000 15.0000 13.0000 14.0000 14.0000
Cleveland 6.5000 7.0000 8.0000 7.0000 7.0000
HeartEW 6.8000 9.0000 8.0000 7.0000 9.0000
Hepatitis 6.9000 9.0000 8.0000 7.0000 9.0000
SAHeart 2.6000 4.0000 3.0000 3.0000 4.0000
SPECTfheart 25.0000 30.0000 27.0000 26.0000 29.0000
Thyroid0387 9.5000 10.0000 12.0000 11.0000 10.0000
Heart 4.2000 6.0000 5.0000 6.0000 6.0000
Pima-diabetes 4.0000 4.0000 4.0000 4.0000 4.0000
Leukemia 3498.1000 3500.0000 3530.0000 3533.0000 3540.0000
Prostate_GE 2982.5000 2972.0000 2540.5000 2533.0000 2966.5000
BreastEW 16.0000 19.0000 17.0000 20.0000 18.0000
Colon 970.3000 990.0000 985.0000 977.0000 980.0000

6. Computational statistical test analysis

The accuracy of the proposed BSO and its improved variant BSO-CV was evaluated in the previous section using two performance measures, and their results on FS problems were contrasted with those of other meta-heuristic and filter-based techniques previously reported in the literature. The average and standard deviation of the best solutions found over 30 independent runs serve as the performance measures. These metrics give a broad picture of how well the proposed algorithms handle FS problems: the first reflects their average performance, and the second their consistency across the 30 independent runs. Although these statistical measures suggest that the proposed BSO and BSO-CV are generally reliable and resilient, they cannot compare the 30 independent runs individually. In other words, they show that the proposed BSO and BSO-CV maintain a sound balance of exploitation and exploration, but they cannot by themselves demonstrate statistical superiority.

This part applies Friedman's and Holm's statistical tests to compare the independent runs and to verify that the results were not produced by chance. Friedman's test is a well-known non-parametric statistical test that is widely used to evaluate algorithms' performance levels; its goal is to ascertain whether there is a fundamental difference between the outputs of the various algorithms [45]. The null hypothesis underlying this test is that there is no variation in the accuracy of the compared algorithms. The algorithm with the best performance receives the lowest rank, and the algorithm with the worst performance receives the highest rank. The Friedman and Holm procedure requires finding the p-value of Friedman's test for the results of the FS problems under consideration. If Friedman's test produces a p-value equal to or less than the significance level (0.05 in this case), the null hypothesis is rejected, indicating statistically significant variations in how well the compared algorithms work. This test is followed by a post-hoc procedure, where Holm's method is used to examine pairwise comparisons between the algorithms; the algorithm with the lowest rank according to Friedman's test is typically used as the control technique in the post-hoc analysis.
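The Friedman procedure described above can be run directly with SciPy; the accuracy arrays below are illustrative placeholders for three hypothetical methods over 23 datasets, not the paper's data:

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(42)
# Illustrative per-dataset accuracies for three hypothetical methods
base = rng.uniform(0.6, 0.9, size=23)
acc_a = base + 0.10                         # consistently the best method
acc_b = base + rng.normal(0, 0.02, 23)      # close to the baseline
acc_c = base - 0.05                         # consistently the worst method

stat, p = friedmanchisquare(acc_a, acc_b, acc_c)
if p <= 0.05:
    print("reject the null hypothesis: the methods differ significantly")
```

Each argument to `friedmanchisquare` is one method's per-dataset results; the test ranks the methods within each dataset, exactly the ranking scheme reported in Table 14 and Table 16.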

In the following subsections, Friedman's and Holm's tests are applied to show that the average results in Table 11, Table 12 are statistically significant and that the differences from the other wrapper-based and filter-based FS methods did not occur by chance.

6.1. A statistical test of BSO compared to other wrapper FS methods

To quantify the statistical disparity between the BSO-CV and the other wrapper FS methods, Friedman's test [46] was performed with a significance level of α = 0.05. Accordingly, based on the outcomes shown in Table 11, the BSO-CV method was ranked together with the other FS methods. A summary of the rankings of the FS approaches determined by Friedman's test on the accuracy results in Table 11 is provided in Table 14.

Table 14.

A summary of the ranking results obtained by applying Friedman’s test to the results given in Table 11.

Algorithm Rank
BSO-CV 1.282608
CHIO-GC 2.260869
LBMFO-V3 2.456521

The p-value reported by Friedman's statistical test on the accuracy results of the FS problems in Table 11 is 1.119E−4. The null hypothesis of equivalent performance was therefore rejected, emphasizing a statistically significant difference between the classification rates of the compared algorithms. According to the results in Table 14, the BSO-CV algorithm outperformed all other comparative algorithms with statistical significance on the datasets listed in Table 1, attaining the lowest rank of 1.282608 at a significance level of 5%. CHIO-GC is the second-best performing algorithm on these datasets, and LBMFO-V3 the third best. Overall, Friedman's test on the FS problems in Table 1 produces the ranking BSO-CV, CHIO-GC, and LBMFO-V3, in that order. Holm's test was then used to determine whether the differences between the BSO-CV and the other algorithms in Table 14 are statistically significant; the results of this pairwise comparison on the FS benchmark datasets outlined in Table 1 are shown in Table 15.

Table 15.

Results of Holm’s test method based on the results in Tables 14 for α=0.05.

i Algorithm z = (R0 − Ri)/SE p-value α÷i Hypothesis
2 LBMFO-V3 3.980932 6.864535E−05 0.025000 Rejected
1 CHIO-GC 3.317444 9.084512E−04 0.050000 Rejected

The findings of Holm’s technique, shown in Table 15, reject every hypothesis whose p-value is less than the adjusted threshold α ÷ i. These findings demonstrate that the BSO-CV statistically outperforms the other competing algorithms. Furthermore, they indicate that the BSO-CV has avoided local optimal solutions by striking a sound balance between its exploration and exploitation capabilities.
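The z statistic in Table 15 follows the standard post-hoc formula z = (R0 − Ri)/SE with SE = sqrt(k(k + 1)/(6N)) for k algorithms over N datasets. Plugging in the mean ranks from Table 14 (k = 3 algorithms, N = 23 datasets) reproduces the reported values:

```python
import math

def holm_z(r_control, r_other, k, n):
    """Post-hoc z statistic comparing two Friedman mean ranks."""
    se = math.sqrt(k * (k + 1) / (6.0 * n))
    return (r_other - r_control) / se

k, n = 3, 23                 # three algorithms, 23 benchmark datasets
r_bso_cv = 1.282608          # control method: best (lowest) rank in Table 14
z_lbmfo = holm_z(r_bso_cv, 2.456521, k, n)
z_chio = holm_z(r_bso_cv, 2.260869, k, n)
print(round(z_lbmfo, 4), round(z_chio, 4))
```

The two z values match the 3.980932 and 3.317444 entries of Table 15, and each corresponding p-value falls below its Holm threshold α ÷ i, confirming the rejections.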

6.2. A statistical test of BSO-CV compared to filter-based FS methods

As shown in Table 12, the results of the proposed BSO-CV are compared with those of well-known filter-based FS approaches to evaluate its robustness further. The acquired accuracy results are then statistically assessed using Friedman’s test, as shown in Table 16.

Table 16.

A summary of the ranking results obtained by Friedman’s test on the results presented in Table 12.

Method Rank
BSO-CV 1.130434
Chi-square 2.478260
Relief 3.521739
CFS 3.782608
IG 4.086956

For the accuracy results in Table 12, the p-value computed by Friedman's test is 1.144E−10. According to the findings presented in Table 16, the BSO-CV algorithm is the most effective overall and statistically significant: it achieved the best rank of 1.130434 at a significance level of α = 5%. With a rank of 2.478260, the Chi-square method comes in second place. The rankings in Table 16 show that the BSO-CV proposed in this work outperforms the filter-based FS approaches evaluated in Table 12. In conclusion, the proposed BSO-CV comes out on top, followed by Chi-square, Relief, CFS, and IG, in that order.

The statistical significance of any differences between the filter-based FS method and the BSO-CV method is then determined using Holm’s test. Based on the datasets mentioned in Table 1, Table 17 presents the statistical findings of Holm’s test.

Table 17.

Results of Holm’s method based on the statistical results of the results in Table 16 (Friedman’s test with α=0.05).

i Method z = (R0 − Ri)/SE p-value α÷i Hypothesis
4 IG 6.341032 2.2823009E−10 0.012500 Rejected
3 CFS 5.688279 1.283257E−08 0.016666 Rejected
2 Relief 5.128776 2.916314E−07 0.025000 Rejected
1 Chi-square 2.890764 0.003843 0.050000 Rejected

For the outcomes in Table 17, Holm's test rejects every hypothesis whose p-value is less than the adjusted threshold α ÷ i. The data in Table 17 show that the BSO-CV outperformed the filter-based approaches when used as an FS method. These statistically significant results indicate that the BSO-CV successfully avoided local solutions while exploring and exploiting the search space of the FS datasets. In addition, the proposed BSO-CV selects the most critical features with minimum redundancy. Thus, the BSO-CV significantly outperformed the state-of-the-art filter methods on most of the benchmark datasets in Table 1. These findings support the proposed BSO's efficiency in addressing FS tasks in the medical domain.

7. Conclusion and future works

Due to the importance of caring for people's lives, diseases must be diagnosed accurately and objectively. Recently, swarm-based algorithms have proved their capability to perform disease classification very efficiently using data mining techniques such as feature selection. This study converts the recently developed SO algorithm into a binary version and enhances it using evolutionary crossover methods. The enhanced BSO is applied to 23 medical benchmark datasets and a real-world COVID-19 dataset. Various measures are used to evaluate the performance of the proposed algorithm, including accuracy, the number of selected features, fitness value, running time, sensitivity, specificity, convergence curves, boxplots, and the T-test. The average, standard deviation, minimum, and maximum values are reported for each evaluation measure.
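The binary conversion referred to here relies on the S-shaped transfer function named in the abstract. A minimal sketch of the general technique (the function below is a generic sigmoid-based binarization, not the paper's exact BSO update rule):

```python
import math
import random

def s_shape_binarize(position, rng=random.random):
    """Map a continuous position vector to a binary feature mask:
    each feature is selected with probability sigmoid(x)."""
    mask = []
    for x in position:
        prob = 1.0 / (1.0 + math.exp(-x))  # S-shaped transfer value in (0, 1)
        mask.append(1 if rng() < prob else 0)
    return mask

# Deterministic illustration using a fixed threshold of 0.5:
print(s_shape_binarize([2.0, -2.0], rng=lambda: 0.5))  # → [1, 0]
```

Large positive positions are almost always mapped to 1 (feature kept) and large negative positions to 0 (feature dropped), which is how a continuous optimizer such as SO is adapted to the discrete FS search space.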

The developed binary version of SO shows promising results for solving medical classification problems, achieving an accuracy of more than 90% on 11 datasets and exceeding 95% on 8 datasets. In addition, since the BSO can fall into local optima like any evolutionary algorithm, this paper proposes an augmented BSO with crossover operators to avoid this drawback and balance the exploration and exploitation phases. The BSO-CV performs better than the BSO, outperforming it on 17 datasets and shrinking the COVID-19 dataset's dimension by 89% as opposed to the BSO's 79%. Since the BSO-CV achieved the best fitness values in early iterations, it has a fast convergence speed; it also saves resources by consuming less time than the BSO, and it is more sensitive and specific. Additionally, the proposed BSO-CV was compared with CHIO-GC, LBMFO-V3, and HLBDA, and the results show its superior performance, achieving the best accuracies with higher reduction rates. Finally, comparing the BSO-CV with well-known filter-based FS methods shows that it outperforms the standard filter methods, confirming its effectiveness in solving FS problems.

In the future, we plan to use the proposed BSO algorithm in cyber security applications such as intrusion detection systems, ransomware detection, and blockchain applications. Furthermore, researchers can direct their efforts to generate a multi-objective version of the SO and use it in the electroencephalography (EEG) field. Other operators and modification methods can be integrated with the SO algorithm to generate a hybrid binary version that can balance the exploration and exploitation phases of the search process.

CRediT authorship contribution statement

Ruba Abu Khurma: Proposed and evolved the mathematical models of the proposed algorithm, Prepared the experiments, tables, diagrams and pseudo-code of the proposed algorithm, Executed the programs and experimental scenarios of the work, Full revision of the entire paper. Dheeb Albashish: Proposed and evolved the mathematical models, Methodology, Formal analysis, Investigation, Validation, Supervision, Full revision of the entire paper. Malik Braik: Implemented the proposed BSO and BSO-CV, Discussed the results, Assisted in the writing. Abdullah Alzaqebah: Discussed all the computational results of the proposed algorithm, Checked the validation of the results and the references. Ashwaq Qasem: Examined the technical concepts in the paper, the readability of the full paper and English grammar, and computed the t-test. Omar Adwan: Offered feedback on the work and helped shape and analyze the work.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

References

  • 1. Sahran S., Albashish D., Abdullah A., Abd Shukor N., Pauzi S.H.M. Absolute cosine-based SVM-RFE feature selection method for prostate histopathological grading. Artif. Intell. Med. 2018;87:78–90. doi: 10.1016/j.artmed.2018.04.002.
  • 2. Alweshah M., Alkhalaileh S., Al-Betar M.A., Bakar A.A. Coronavirus herd immunity optimizer with greedy crossover for feature selection in medical diagnosis. Knowl.-Based Syst. 2022;235. doi: 10.1016/j.knosys.2021.107629.
  • 3. Albashish D., Hammouri A.I., Braik M., Atwan J., Sahran S. Binary biogeography-based optimization based SVM-RFE for feature selection. Appl. Soft Comput. 2021;101.
  • 4. Braik M. Enhanced Ali Baba and the forty thieves algorithm for feature selection. Neural Comput. Appl. 2022:1–32. doi: 10.1007/s00521-022-08015-5.
  • 5. Abu Khurma R., Aljarah I., Sharieh A., Abd Elaziz M., Damaševičius R., Krilavičius T. A review of the modification strategies of the nature inspired algorithms for feature selection problem. Mathematics. 2022;10(3):464.
  • 6. Xue B., Zhang M., Browne W.N., Yao X. A survey on evolutionary computation approaches to feature selection. IEEE Trans. Evol. Comput. 2015;20(4):606–626.
  • 7. Deng Z., Chung F.-L., Wang S. Robust relief-feature weighting, margin maximization, and fuzzy optimization. IEEE Trans. Fuzzy Syst. 2010;18(4):726–744.
  • 8. Ramírez-Gallego S., Lastra I., Martínez-Rego D., Bolón-Canedo V., Benítez J.M., Herrera F., Alonso-Betanzos A. Fast-mRMR: Fast minimum redundancy maximum relevance algorithm for high-dimensional big data. Int. J. Intell. Syst. 2017;32(2):134–152.
  • 9. Guyon I., Elisseeff A. An introduction to variable and feature selection. J. Mach. Learn. Res. 2003;3(Mar):1157–1182.
  • 10. Abdel-Basset M., Abdel-Fatah L., Sangaiah A.K. Metaheuristic algorithms: A comprehensive review. Comput. Intell. Multimed. Big Data Cloud Eng. Appl. 2018:185–231.
  • 11. Saw T., Myint P.H. Feature selection to classify healthcare data using wrapper method with PSO search. Int. J. Inf. Technol. Comput. Sci. 2019;11(9):31–37.
  • 12. Abu Khurmaa R., Aljarah I., Sharieh A. An intelligent feature selection approach based on moth flame optimization for medical diagnosis. Neural Comput. Appl. 2021;33(12):7165–7204.
  • 13. Alweshah M. Hybridization of arithmetic optimization with great Deluge algorithms for feature selection problems in medical diagnosis. Jordanian J. Comput. Inf. Technol. 2022;8(2).
  • 14. Awadallah M.A., Al-Betar M.A., Braik M.S., Hammouri A.I., Doush I.A., Zitar R.A. An enhanced binary Rat Swarm Optimizer based on local-best concepts of PSO and collaborative crossover operators for feature selection. Comput. Biol. Med. 2022. doi: 10.1016/j.compbiomed.2022.105675.
  • 15. Alweshah M., Alkhalaileh S., Albashish D., Mafarja M., Bsoul Q., Dorgham O. A hybrid mine blast algorithm for feature selection problems. Soft Comput. 2021;25(1):517–534.
  • 16. Hashim F.A., Hussien A.G. Snake Optimizer: A novel meta-heuristic optimization algorithm. Knowl.-Based Syst. 2022.
  • 17. Rawa M. Towards avoiding cascading failures in transmission expansion planning of modern active power systems using hybrid snake-Sine cosine optimization algorithm. Mathematics. 2022;10(8):1323.
  • 18. Khurma R.A., Aljarah I., Sharieh A. Rank based moth flame optimisation for feature selection in the medical application. In: 2020 IEEE Congress on Evolutionary Computation (CEC). IEEE; 2020. pp. 1–8.
  • 19. Le T.M., Vo T.M., Pham T.N., Dao S.V.T. A novel wrapper-based feature selection for early diabetes prediction enhanced with a metaheuristic. IEEE Access. 2020;9:7869–7884.
  • 20. Mazaheri V., Khodadadi H. Heart arrhythmia diagnosis based on the combination of morphological, frequency and nonlinear features of ECG signals and metaheuristic feature selection algorithm. Expert Syst. Appl. 2020;161.
  • 21. Abd Elminaam D.S., Nabil A., Ibraheem S.A., Houssein E.H. An efficient marine predators algorithm for feature selection. IEEE Access. 2021;9:60136–60153.
  • 22. Mirjalili S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl.-Based Syst. 2015;89:228–249.
  • 23. Khurma R.A., Aljarah I., Sharieh A. A simultaneous moth flame optimizer feature selection approach based on levy flight and selection operators for medical diagnosis. Arab. J. Sci. Eng. 2021;46(9):8415–8440.
  • 24. Dhanusha C., Senthil Kumar A., Jagadamba G., Musirin I.B. Evolving chaotic shuffled frog leaping memetic metaheuristic model-based feature subset selection for Alzheimer's disease detection. In: Sustainable Communication Networks and Application. Springer; 2022. pp. 679–692.
  • 25. Jaddi N.S., Abadeh M.S. Cell separation algorithm with enhanced search behaviour in miRNA feature selection for cancer diagnosis. Inf. Syst. 2022;104.
  • 26. Abouelmagd L.M., Shams M.Y., El-Attar N.E., Hassanien A.E. Feature selection based coral reefs optimization for breast cancer classification. In: Medical Informatics and Bioimaging using Artificial Intelligence. Springer; 2022. pp. 53–72.
  • 27. Kanya Kumari L., Naga Jagadesh B. An adaptive teaching learning based optimization technique for feature selection to classify mammogram medical images in breast cancer detection. Int. J. Syst. Assur. Eng. Manag. 2022:1–14.
  • 28. Dey A., Chattopadhyay S., Singh P.K., Ahmadian A., Ferrara M., Senu N., Sarkar R. MRFGRO: a hybrid meta-heuristic feature selection method for screening COVID-19 using deep features. Sci. Rep. 2021;11(1):1–15. doi: 10.1038/s41598-021-02731-z.
  • 29. Aslan M.F., Sabanci K., Durdu A., Unlersen M.F. COVID-19 diagnosis using state-of-the-art CNN architecture features and Bayesian Optimization. Comput. Biol. Med. 2022. doi: 10.1016/j.compbiomed.2022.105244.
  • 30. Davazdahemami B., Zolbanin H.M., Delen D. An explanatory machine learning framework for studying pandemics: The case of COVID-19 emergency department readmissions. Decis. Support Syst. 2022. doi: 10.1016/j.dss.2022.113730.
  • 31. Bandyopadhyay R., Basu A., Cuevas E., Sarkar R. Harris hawks optimisation with simulated annealing as a deep feature selection method for screening of COVID-19 CT-scans. Appl. Soft Comput. 2021;111. doi: 10.1016/j.asoc.2021.107698.
  • 32. Deniz A., Kiziloz H.E., Sevinc E., Dokeroglu T. Predicting the severity of COVID-19 patients using a multi-threaded evolutionary feature selection algorithm. Expert Syst. 2022.
  • 33. Kurnaz S., et al. Feature selection for diagnose coronavirus (COVID-19) disease by neural network and Caledonian crow learning algorithm. Appl. Nanosci. 2022:1–16. doi: 10.1007/s13204-021-02159-x.
  • 34. Kukker A., Sharma R. JAYA-optimized fuzzy reinforcement learning classifier for COVID-19. IETE J. Res. 2022:1–12.
  • 35. Ragab M., Eljaaly K., Alhakamy N.A., Alhadrami H.A., Bahaddad A.A., Abo-Dahab S.M., Khalil E.M. Deep ensemble model for COVID-19 diagnosis and classification using chest CT images. Biology. 2022;11(1):43. doi: 10.3390/biology11010043.
  • 36. Too J., Mirjalili S. A hyper learning binary dragonfly algorithm for feature selection: A COVID-19 case study. Knowl.-Based Syst. 2021;212.
  • 37. Irene D.S., Beulah J.R. An efficient COVID-19 detection from CT images using ensemble support vector machine with ludo game-based swarm optimisation. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2022:1–12.
  • 38. Wang W., Pei Y., Wang S., Gorriz J.M., Zhang Y. PSTCNN: Explainable COVID-19 diagnosis using PSO-guided self-tuning CNN. Biocell. 2022;47(2):373–384. doi: 10.32604/biocell.2021.0xxx.
  • 39. Wang W., Zhang X., Wang S.-H., Zhang Y.-D. Covid-19 diagnosis by WE-SAJ. Syst. Sci. Control Eng. 2022;10(1):325–335. doi: 10.1080/21642583.2022.2045645.
  • 40. Khurma R.A., Aljarah I., Sharieh A., Mirjalili S. EvoloPy-FS: An open-source nature-inspired optimization framework in Python for feature selection. In: Evolutionary Machine Learning Techniques. Springer; 2020. pp. 131–173.
  • 41. Jiang X., Coffee M., Bari A., Wang J., Jiang X., Huang J., Shi J., Dai J., Cai J., Zhang T., et al. Towards an artificial intelligence framework for data-driven prediction of coronavirus clinical severity. Comput. Mater. Continua. 2020;63(1):537–551.
  • 42. Soomro T.A., Zheng L., Afifi A.J., Ali A., Yin M., Gao J. Artificial intelligence (AI) for medical imaging to combat coronavirus disease (COVID-19): A detailed review with direction for future research. Artif. Intell. Rev. 2021:1–31. doi: 10.1007/s10462-021-09985-z.
  • 43. Alzaqebah A., Aljarah I., Al-Kadi O. A hierarchical intrusion detection system based on extreme learning machine and nature-inspired optimization. Comput. Secur. 2023;124.
  • 44. Li Y., Zhu X., Liu J. An improved moth-flame optimization algorithm for engineering problems. Symmetry. 2020;12(8):1234.
  • 45. Mustafa H.M., Ayob M., Albashish D., Abu-Taleb S. Solving text clustering problem using a memetic differential evolution algorithm. PLoS One. 2020;15(6). doi: 10.1371/journal.pone.0232816.
  • 46. Wang Z., Li M., Li J. A multi-objective evolutionary algorithm for feature selection based on mutual information with a new redundancy measure. Inform. Sci. 2015;307:73–88.



Articles from Biomedical Signal Processing and Control are provided here courtesy of Elsevier
