ST-AL: a hybridized search based metaheuristic computational algorithm towards optimization of high dimensional industrial datasets

Reham R Mostafa; Noha E El-Attar; Sahar F Sabbeh; Ankit Vidyarthi; Fatma A Hashim

doi:10.1007/s00500-022-07115-7

2022 May 9:1–29. Online ahead of print. doi: 10.1007/s00500-022-07115-7


PMCID: PMC9081968 PMID: 35574265

Abstract

The rapid growth of data generated by applications in engineering, biotechnology, energy, and other fields has become a crucial challenge in high-dimensional data mining. Large datasets, especially high-dimensional ones, may contain many irrelevant, redundant, or noisy features, which can degrade the accuracy and efficiency of the industrial data mining process. Recently, several meta-heuristic optimization algorithms have been used to develop feature selection techniques that deal with this dimensionality problem. Although optimization algorithms can find a near-optimal feature subset of the search space, they still face some global optimization challenges. This paper proposes an improved version of the sooty tern optimization algorithm (ST, also abbreviated STOA below), namely the ST-AL method, to improve search performance on high-dimensional industrial optimization problems. ST-AL boosts the performance of STOA by applying four strategies. The first is the use of control randomization parameters that balance the exploration and exploitation stages during the search and help avoid falling into local optima. The second is a new exploration phase based on the ant lion (AL) algorithm. The third improves the STOA exploitation phase by modifying the main position-updating equation. Finally, greedy selection is used to discard poorly generated solutions and keep the population from diverging from the existing promising regions. To evaluate the proposed ST-AL algorithm, it was employed as a global optimization method to find the optimal values of ten CEC2020 benchmark functions. It was also applied as a feature selection approach to 16 benchmark datasets from the UCI repository and compared with seven well-known optimization-based feature selection methods. The experimental results reveal the superiority of the proposed algorithm in avoiding local minima and increasing the convergence rate. The results are compared with those of state-of-the-art algorithms, namely ALO, STOA, PSO, GWO, HHO, MFO, and MPA, and the mean accuracy achieved is in the range 0.94–1.00.

Keywords: Sooty tern optimization, Ant lion optimization, Feature optimization, Metaheuristic algorithm, High dimensional search space

Introduction

In the past decades, optimization problems have attracted extensive attention in several fields, including computer science, engineering, operational research, energy, and business (Oliva and Elaziz 2020). In general, optimization techniques aim to identify the best solutions from a set of available alternatives in the problem search space. Optimization problems can be categorized as binary or continuous, static or dynamic, single-objective or multi-objective, and constrained or unconstrained (Hussien and Amin 2021). In sophisticated optimization problems, it is imperative to investigate the search space adequately based on the problem type (Anand and Arora 2020). Consequently, as optimization problems grow in complexity and variety, conventional mathematical techniques (e.g., Newton’s method and gradient descent) have become impractical because they are time-consuming and prone to becoming trapped in local optima (Hussien and Amin 2021).

Meta-heuristic techniques have been successfully developed to handle many difficult optimization problems effectively. They can exploit significant information from the search space and determine the optimal solution rapidly and efficiently (Anand and Arora 2020). Almost all meta-heuristic algorithms are inspired by nature, such as the behavior of animals, birds, insects, and even humans (Hussien and Amin 2021). Genetic algorithm (GA) (Goldberg and Holland 1988), particle swarm optimization (PSO) (Eberhart and Kennedy 1995), differential evolution (DE) (Storn and Price 1997), firefly algorithm (Yang 2010), flower pollination algorithm (FPA) (Yang 2012), artificial bee colony (ABC) (Karaboga and Basturk 2007), and grey wolf optimization algorithm (GWO) (Mirjalili et al. 2014) are examples of the original and prominent meta-heuristic algorithms. More recently, several nature-inspired meta-heuristic techniques have been introduced, including the grasshopper optimization algorithm (GOA) (Mirjalili et al. 2018), selfish herd optimizer (SHO) (Fausto et al. 2017), honey badger algorithm (HBA) (Hashim et al. 2022), butterfly optimization algorithm (BOA) (Arora and Singh 2019), sine cosine algorithm (SCA) (Mirjalili 2016), salp swarm algorithm (SSA) (Mirjalili et al. 2017), and snake optimizer (SO) (Hashim and Hussien 2022).

A meta-heuristic algorithm primarily contains two fundamental stages: exploration and exploitation. The exploration phase commonly relies on randomization to search the space broadly, while the exploitation phase concentrates on the most promising region of the search space. Separately, knowledge discovery over high-dimensional datasets is crucial, and it requires preparing the data through a pre-processing stage (Anand and Arora 2020). This pre-processing step is used mainly to reduce the dimensionality of high-dimensional data by stripping irrelevant, redundant, missing, and noisy features from the dataset (Sayed et al. 2018). In general, feature selection is considered a vital data pre-processing method for coping with the curse of dimensionality. Feature selection strategies aim to pick a subset of features based on a set of criteria while maintaining the physical meanings of the original features (Huang et al. 2020). Feature selection can improve the comprehensibility of the learning model by reducing the search space size and can increase learning efficiency (i.e., training time and classifier complexity are reduced, and prediction performance or classification accuracy is improved) (Zhang et al. 2014).

Feature selection approaches are commonly divided into three categories based on the methods used to evaluate feature subsets: filter, wrapper, and embedded methods (Neggaz et al. 2020). A filter method selects features using the intrinsic properties of the data (Teng et al. 2017). Filter methods are called classifier-independent since they evaluate information that is important for classification regardless of the machine learning technique (Rani and Rajalaxmi 2015). Filter approaches are fast because they do not use a learning algorithm to evaluate attributes, but they do not provide enough information to categorize samples. The Fast Correlation-based Filter (FCBF) and minimal-redundancy-maximal-relevance (mRMR) are two examples of filter methods. Wrapper and embedded models, on the other hand, depend on the classifier. The wrapper model investigates the space of potential solutions using a machine learning technique (Emary et al. 2016), and the validation accuracy of a given classifier is used to evaluate the selected subset. Embedded approaches discover, as the classification model is being built, which features have the greatest impact on its accuracy. A wrapper method typically outperforms a filter method since the proposed subset of features is evaluated for accuracy using feedback from the learning algorithm. However, wrapper methods are computationally more expensive, and their performance depends on the applied learning method.
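
The wrapper idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classifier is a hand-written 1-nearest-neighbour rule, the data are a hypothetical toy set, and the names `wrapper_accuracy` and `one_nn_predict` are invented here; none of this is taken from the paper.

```python
def project(row, cols):
    """Keep only the selected feature columns of one sample."""
    return [row[j] for j in cols]


def one_nn_predict(train_X, train_y, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[nearest]


def wrapper_accuracy(X, y, mask, train_idx, test_idx):
    """Validation accuracy of 1-NN restricted to the features where mask is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    train_X = [project(X[i], cols) for i in train_idx]
    train_y = [y[i] for i in train_idx]
    hits = sum(
        one_nn_predict(train_X, train_y, project(X[i], cols)) == y[i]
        for i in test_idx
    )
    return hits / len(test_idx)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is misleading.
X = [[0.0, 5.0], [0.1, -5.0], [1.0, 4.0], [1.1, -4.0], [0.05, 4.1], [1.05, -4.9]]
y = [0, 0, 1, 1, 0, 1]
train_idx, test_idx = [0, 1, 2, 3], [4, 5]

acc_informative = wrapper_accuracy(X, y, [1, 0], train_idx, test_idx)
acc_noise = wrapper_accuracy(X, y, [0, 1], train_idx, test_idx)
```

Each candidate subset costs one full train-and-validate cycle, which is why wrapper methods are more expensive than filters.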

Accordingly, the most critical aspect of a feature selection algorithm is searching for an optimal or nearly optimal subset of features that increases the classifier’s accuracy and reduces the computational complexity. Exhaustive search methods such as breadth-first and depth-first search are considered infeasible for discovering a subset of features, especially in massive datasets. A dataset containing $M$ features has $2^M$ possible feature subsets. The quality of these feature subsets needs to be evaluated (Zhang et al. 2014), which is computationally intensive, especially in wrapper-based approaches, where the learning algorithm must be run for each subset. A practical alternative is to treat feature selection as an NP-hard optimization problem whose objective function minimizes the number of selected features while preserving the highest classification accuracy. This means that feature selection problems could benefit from metaheuristics, which have shown extraordinary performance in tackling various optimization problems (Motoda and Liu 2002). Metaheuristic algorithms can address complex optimization problems because of their dynamic search behaviors and global search capability. Indeed, several meta-heuristic algorithms have been used to improve the performance of the feature selection process, including genetic algorithms (Oh et al. 2004), particle swarm optimization (Gu et al. 2018), the ant colony optimization (ACO) algorithm (Aghdam et al. 2009), the artificial bee colony (ABC) algorithm (Uzer et al. 2013), the binary gravitational search algorithm (BGSA) (Papa et al. 2011), the scatter search algorithm (SSA) (Wang et al. 2012), the archimedes optimization algorithm (AOA) (Desuky et al. 2021), the backtracking search algorithm (BSA) (Ghanem and Layeb 2021), and the moth-flame optimization (MFO) algorithm (Soliman et al. 2018).
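
To make the trade-off concrete, the following Python sketch counts the candidate subsets and scores one subset with a weighted sum of error rate and subset size. The weighted-sum form and the weight `alpha = 0.99` are assumptions commonly seen in wrapper feature selection; this passage does not give the exact objective, so treat both as hypothetical.

```python
def count_subsets(num_features):
    """Number of distinct feature subsets of a dataset with num_features features."""
    return 2 ** num_features


def fitness(error_rate, num_selected, num_features, alpha=0.99):
    """Lower is better: alpha weights the classification error and
    (1 - alpha) weights the fraction of features kept (assumed form)."""
    return alpha * error_rate + (1.0 - alpha) * num_selected / num_features


# A 13-feature dataset already has 8192 candidate subsets.
subsets_13 = count_subsets(13)

# With equal error, the smaller subset receives the better (lower) score.
small = fitness(error_rate=0.10, num_selected=6, num_features=13)
full = fitness(error_rate=0.10, num_selected=13, num_features=13)
```

Because `alpha` is close to 1, accuracy dominates the score and the subset size acts mainly as a tie-breaker.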

Most of the originally introduced optimization techniques suffer from performance shortcomings, especially when applied to large-scale datasets. These shortcomings stem from an imbalance between the exploration and exploitation stages, which leads to entrapment in local optima or poor convergence. Consequently, much of the recent feature selection literature either modifies existing metaheuristic algorithms to improve their performance or hybridizes different metaheuristic algorithms so that one technique improves the search efficiency of the other. Examples include the hybridization of the Harris hawks optimization (HHO) algorithm with simulated annealing (SA) (Abdel-Basset et al. 2021), the arithmetic optimization algorithm (AOA) with the genetic algorithm (GA) (Ewees et al. 2021), the salp swarm algorithm (SSA) with the sine cosine algorithm (SCA) (Neggaz et al. 2020), and the seagull optimization algorithm (SOA) with Lévy flight and a mutation operator (Ewees et al. 2022).

However, these methodologies have some restrictions that affect the quality of the final solution. According to the No Free Lunch (NFL) theorem (Wolpert and Macready 1997), no algorithm is better than all others on all classes of feature selection problems. Therefore, new algorithms or improved versions of existing ones are needed to deal with feature selection challenges more effectively. This is our primary motivation for proposing a new feature selection approach that enhances the performance of a recent metaheuristic algorithm, the Sooty Tern Optimization Algorithm (STOA) (Dhiman and Kaur 2019). The improvement uses the Ant lion optimization (ALO) algorithm (Mirjalili 2015a) to enhance the exploration of STOA, owing to ALO’s capacity to locate the feasible regions that contain the optimal solution.

STOA is a population-based metaheuristic algorithm developed by Dhiman and Kaur that simulates the migration and attacking behaviors of the sooty tern, a sea bird (Dhiman and Kaur 2019). It has attracted considerable attention in recent years and has been used in a variety of applications (Ali et al. 2021; Zheng et al. 2021; Kader and Zamli 2022). Despite these applications, STOA still needs improvement to overcome its limitations. For example, the STOA exploration phase is based on the best solution only, which prevents it from exploring the search space properly to find the promising region that contains the optimal solution. ALO, on the other hand, is a popular metaheuristic algorithm proposed by Mirjalili (2015a) and inspired by the hunting mechanism of antlions. It is characterized by good exploration and exploitation phases, avoidance of local optima, and rapid convergence to the optimal solution.

In this study, a novel hybridization technique is proposed that boosts the performance of STOA through the use of the ALO algorithm. This hybridization is called the ST-AL method. The performance of the proposed ST-AL method was assessed using two experiments: (1) solving global optimization problems and (2) solving feature selection challenges. The main contributions of this paper can be summarized as follows:

Developed a novel hybrid method based on the Sooty Tern Optimization Algorithm (ST) and Ant Lion Optimization (AL), called ST-AL.
Tested ST-AL on the CEC’2020 test suite.
Employed ST-AL as a wrapper feature selection algorithm on large and small benchmark datasets.
Compared the performance of ST-AL with established swarm intelligence algorithms such as PSO, GWO, HHO, MFO, and MPA, as well as the conventional ST and AL algorithms.
Demonstrated the effectiveness and superiority of the proposed ST-AL in both global optimization and feature selection problems.

The rest of the paper is organized as follows: Sect. 2 reviews the related work. Sect. 3 gives preliminaries on the underlying algorithms. Sect. 4 presents the proposed methodology in detail. Sect. 5 evaluates the performance of the proposed algorithm. Finally, Sect. 6 concludes the work and outlines future scope.

Related works

Recently, meta-heuristic algorithms have attracted attention as an efficient way to find optimal solutions and enhance the feature selection process, especially given the massive increase in data volume and complexity. To enhance the optimization process, several studies have made existing meta-heuristic optimization algorithms more robust against the local optima problem in large solution spaces. For instance, some researchers have used chaotic search to improve the search process and address local optima and low convergence rates. Arora et al. (2020) presented a novel Chaotic Interior Search Algorithm (CISA) that integrates the Interior Search Algorithm with chaos theory to address both entrapment in local optima and slow convergence speed; the algorithm was tested on 13 global benchmark functions. Sayed et al. (2018) also adopted chaos theory to enhance the performance of the Salp Swarm Algorithm (SSA), proposing the Chaotic Salp Swarm Algorithm, which employs ten different chaotic maps to improve the convergence rate and resulting accuracy. Chaotic search has also boosted the search process of the selfish herd optimizer (SHO): Anand and Arora (2020) proposed a Chaotic Selfish Herd Optimizer (CSHO) that uses various chaotic maps to set the value of each search agent’s survival parameter, which helps control both exploration and exploitation. Likewise, Oliva and Elaziz (2020) applied chaotic maps and opposition-based learning (OBL) to enhance the performance of the brainstorm optimization (BSO) algorithm. The proposed algorithm is called opposition chaotic BSO with disruption (OCBSOD). Its idea can be summarized in the following steps: first, a chaotic map is applied to compute the initial solutions; next, opposition-based learning produces the opposite positions in the search space; then the best particles are identified and used in the iterative process. The disruption operator updates the positions of the individuals in the population. Finally, OBL is applied to enhance exploration of the search domain.
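
The first three steps (chaotic initialization, opposite positions, keeping the best candidates) can be sketched in Python as follows. This is an illustrative reading, not the OCBSOD implementation: the logistic map is only one possible chaotic map, the opposite point is taken as `lower + upper - x`, the best half of the combined set is kept, and all function names and the `sphere` objective are hypothetical.

```python
def logistic_map_sequence(seed, length):
    """Chaotic sequence in [0, 1] from the logistic map x <- 4x(1 - x)."""
    values, x = [], seed
    for _ in range(length):
        x = 4.0 * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_population(pop_size, dim, lower, upper, seed=0.7):
    """Scale the chaotic sequence into the box [lower, upper]^dim."""
    chaos = logistic_map_sequence(seed, pop_size * dim)
    return [
        [lower + chaos[i * dim + j] * (upper - lower) for j in range(dim)]
        for i in range(pop_size)
    ]


def opposite(position, lower, upper):
    """Opposition-based learning: mirror each coordinate within the bounds."""
    return [lower + upper - x for x in position]


def sphere(position):
    """Hypothetical objective to minimize, used only for the demo."""
    return sum(x * x for x in position)


def init_with_obl(pop_size, dim, lower, upper, objective):
    """Build chaotic solutions plus their opposites, then keep the best pop_size."""
    population = chaotic_population(pop_size, dim, lower, upper)
    candidates = population + [opposite(p, lower, upper) for p in population]
    candidates.sort(key=objective)
    return candidates[:pop_size]


start = init_with_obl(pop_size=5, dim=3, lower=-5.0, upper=5.0, objective=sphere)
```

Evaluating each point together with its opposite doubles the initial coverage of the search box at the cost of one extra objective evaluation per individual.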

Harris hawks optimization (HHO) is another recent meta-heuristic algorithm, inspired by the cooperative chasing behavior of Harris’s hawks. In Hussien and Amin (2021), the performance of HHO was improved by integrating it with opposition-based learning, chaotic local search, and a self-adaptive technique. Wang et al. (2021) also sought to enhance the search performance of HHO for global optimization by developing a hybrid algorithm that combines HHO with the Aquila Optimizer (AO).

In the same context, Long et al. (2021) developed a modified version of the butterfly optimization algorithm (BOA) with an adaptive gbest-guided search strategy and pinhole-imaging-based learning to overcome the local optimum problem that may occur when solving high-dimensional optimization problems. The proposed algorithm (PIL-BOA) was evaluated on 23 classical benchmark test functions, 30 complex benchmark functions from IEEE CEC2014, 30 more recent benchmark functions from CEC 2017, and 21 feature selection problems. EL-Hasnony et al. (2021) also modified the butterfly algorithm by combining it with the PSO algorithm to boost its global optimization performance, and investigated the performance of the proposed algorithm on a COVID-19 dataset. Chaotic local search and opposition-based learning have also been integrated with the butterfly optimization algorithm to obtain optimal or near-optimal results (Assiri 2021).

The whale optimization algorithm (WOA), which simulates the food-searching and migration behavior of humpback whales, has also been combined with a modified conjugate gradient algorithm in Khaleel and Mitras (2020). This hybrid algorithm derives a new conjugate coefficient to improve the efficacy of solving global optimization problems. In another context, WOA has been used to enhance other optimization algorithms because of its strong global search ability, as in Che and He (2021), who integrated WOA with the seagull optimization algorithm (SOA) and presented a modified version of SOA called WSOA. Thermal exchange optimization is another algorithm that has been combined with SOA to enhance its exploitation ability and solve feature selection problems (Jia et al. 2019). Several other studies have hybridized various optimization techniques, such as chaotic crow search with particle swarm optimization (Adamu et al. 2021), the sine cosine algorithm with cuckoo search (Khamees and Al-Baset 2020), and the firefly algorithm with differential evolution (Zhang et al. 2016).

According to the findings of the studies above, optimization algorithms still need further development to enhance their exploitation ability and to address global optimization problems such as slow convergence, low computational accuracy, and entrapment in local optima. Table 1 lists recent research that applied hybridization to enhance metaheuristic optimization algorithms and solve the feature selection problem. This paper presents a new approach to pick the most informative features by boosting the performance of the Sooty Tern Optimization algorithm (STOA) and hybridizing it with the Ant Lion algorithm (ALO).

Table 1.

Recent approaches of hybrid optimization techniques

References	Utilized algorithms	Year
Zhang et al. (2016)	Firefly algorithm and differential evolution	2016
Sayed et al. (2018)	Salp swarm algorithm and chaotic maps	2018
Jia et al. (2019)	Seagull optimization algorithm with thermal exchange optimization	2019
Oliva and Elaziz (2020)	Brainstorm optimization algorithm, chaotic maps, and opposition-based learning	2020
Khamees and Al-Baset (2020)	Sine cosine algorithm and cuckoo search	2020
Khaleel and Mitras (2020)	Whale optimization algorithm with modified conjugate gradient algorithm	2020
Anand and Arora (2020)	Chaotic search and Selfish Herd Optimizer	2020
Arora et al. (2020)	Interior search algorithm and chaos theory	2020
Hussien and Amin (2021)	Harris hawks optimization, opposition-based learning, chaotic local search, and self-adaptive technique	2021
Wang et al. (2021)	Harris hawks optimization with Aquila optimizer	2021
Long et al. (2021)	Butterfly optimization algorithm and Pinhole-imaging-based learning	2021
EL-Hasnony et al. (2021)	Butterfly algorithm with PSO	2021
Assiri (2021)	Butterfly algorithm, chaotic local search, and opposition-based learning	2021
Che and He (2021)	Whale optimization with Seagull optimization algorithm	2021
Adamu et al. (2021)	Chaotic crow search and PSO	2021

Parameter name	Problem	Value
Population size (N)	CEC2020	30
Population size (N)	Feature selection	30
Max iterations (tmax)	CEC2020	3000
Max iterations (tmax)	Feature selection	100
Problem dimension (dim)	CEC2020	10 and 20
Problem dimension (dim)	Feature selection	Dataset features
Number of independent runs	CEC2020	30
Number of independent runs	Feature selection	30

Algorithms	Parameters setting
PSO	$wMax = 0.9$ , $wMin = 0.1$ (Default)
GWO	a decreases linearly from 2 to 0
HHO	$beta = 1.5$ (Default)
MFO	$b = 1$ and a decreases linearly from $-1$ to $-2$ (Default)
MPA	$FADs = 0.2$ , $P = 0.5$ , $β = 1.5$
ALO	–
STOA	$C_{f} = 2, C_{B} \in [0, 0.5], u, v = 1$
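
The "decreases linearly" settings listed for GWO and MFO can be read as a simple iteration schedule. The Python sketch below shows one such reading; the function name `linear_decrease` and the choice to index iterations from 0 to `t_max` are assumptions, not details given in the table.

```python
def linear_decrease(t, t_max, start=2.0, end=0.0):
    """Value of a control parameter at iteration t, moving linearly from start to end."""
    return start + (end - start) * t / t_max


# GWO-style schedule: a goes from 2 to 0 over the run.
a_first = linear_decrease(0, 100)
a_middle = linear_decrease(50, 100)
a_last = linear_decrease(100, 100)

# MFO-style schedule: a goes from -1 to -2 over the run.
mfo_middle = linear_decrease(50, 100, start=-1.0, end=-2.0)
```

Large values early in the run favor exploration, and the shrinking value later shifts the search toward exploitation.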

No.	Function description	Fi*
Unimodal function
F1	Shifted and rotated Bent Cigar function	100
Multimodal shifted and rotated functions
F2	Shifted and rotated Schwefel’s function	1100
F3	Shifted and rotated Lunacek bi-Rastrigin function	700
F4	Expanded Rosenbrock’s plus Griewangk’s function	1900
Hybrid functions
F5	Hybrid function 1 ( $N = 3$ )	1700
F6	Hybrid function 2 ( $N = 4$ )	1600
F7	Hybrid function 3 ( $N = 5$ )	2100
Composition functions
F8	Composition function 1 ( $N = 3$ )	2200
F9	Composition function 2 ( $N = 4$ )	2400
F10	Composition function 3 ( $N = 5$ )	2500
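
As an illustration of how the Fi* column relates to a benchmark function, the Python sketch below evaluates the standard Bent Cigar base function and a simplified shifted variant with the F1 bias of 100. The rotation matrix of the official CEC2020 definition is omitted here (treated as the identity), and the shift vector is hypothetical, so this is not the official F1.

```python
def bent_cigar(z):
    """Base Bent Cigar function: z_1^2 + 10^6 * sum of the remaining squared terms."""
    return z[0] ** 2 + 1e6 * sum(v * v for v in z[1:])


def shifted_bent_cigar(x, shift, bias=100.0):
    """Simplified F1: shift the input, skip the rotation (assumption), add the bias Fi*."""
    z = [xi - oi for xi, oi in zip(x, shift)]
    return bent_cigar(z) + bias


shift = [1.0, 2.0, -3.0]  # hypothetical shift vector, not the official CEC2020 data
at_optimum = shifted_bent_cigar(shift, shift)
off_optimum = shifted_bent_cigar([1.0, 2.0, -2.0], shift)
```

At the shifted optimum the function value equals the bias, which is why a result of exactly 100 on F1 means the global optimum was reached.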

Function	Measures	PSO	GWO	HHO	MFO	MPA	ALO	STOA	ST-AL
F1	Best	100.826	1512.383	97240.3	137.9296	100.6736	102.9929	195,835.9	100
	Worst	4936.871	3.28E+08	494,935.9	1.42E+09	12,734.87	12,356.59	7.83E+08	100
	Mean	1676.327	17,447,639	255,933.6	88,533,559	6574.686	2190.297	1.92E+08	100
	Std	1434.075	73,156,598	96,382.65	3.15E+08	4574.322	3082.667	2.49E+08	0
F2	Best	1342.317	1108.298	1571.98	1204.36	1116.859	1419.401	1568.529	1115.36157
	Worst	2300.557	2228.222	2383.975	2641.262	1837.691	2260.379	2200.763	1490.702
	Mean	1747.702	1543.441	1949.897	2078.011	1451.227	1838.022	1902.805	1273.42804
	Std	290.0656	258.5856	240.7425	361.8118	187.4426	228.5664	174.2322	100.063159
F3	Best	716.9109	718.62	746.467	717.3873	712.7313	719.7333	720.7049	712.142278
	Worst	741.807	747.2259	821.3927	753.9382	721.5218	758.4034	781.3539	725.842949
	Mean	726.4704	728.6477	783.6794	733.184	716.7477	740.1851	748.8464	716.590131
	Std	6.466513	8.060786	20.39673	9.844801	2.781191	11.96006	12.72752	2.89892234
F4	Best	1900.574	1900.51	1902.453	1900.8	1900.203	1900.506	1900.992	1900.3827
	Worst	1901.93	1903.119	1911.646	1960.669	1901.11	1902.07	1906.199	1901.58745
	Mean	1901.09	1901.502	1906.15	1905.063	1900.583	1901.16	1903.086	1900.83892
	Std	0.376943	0.775824	2.21147	13.15188	0.22342	0.487497	1.351234	0.30449676
F5	Best	2095.018	2870.067	2831.059	4948.331	2093.779	2232.061	4228.143	1700
	Worst	10,125.04	346,987.8	101,688.2	154,875.6	12,427.31	13,213.51	36,828.5	1704.97479
	Mean	5003.836	39,555.51	38,316.17	28,684.16	6790.615	6622.3	11,604.37	1700.92772
	Std	2693.21	105,163.4	39,034.46	34,291.65	3464.369	3417.503	7102.483	1.08564299
F6	Best	1719.861	1601.307	1602.998	1601.538	1600.049	1601.493	1650.931	1600.90527
	Worst	2057.216	1854.354	2016.48	1970.371	1601.141	1955.882	1944.735	1613.98899
	Mean	1817.027	1732.626	1761.277	1809.43	1600.532	1739.135	1749.094	1602.74265
	Std	86.93557	89.95497	90.97596	122.1287	0.334115	93.09908	51.94684	3.66477493
F7	Best	2101.296	2493.738	2648.412	2810.218	2166.543	2723.26	2873.355	2100.04555
	Worst	2817.753	16,101.66	28,462.2	47,264.84	2566.775	23,473.42	17,294.41	2100.86278
	Mean	2317.767	8872.406	9094.916	12,352.6	2316.229	9380.929	7568.5	2100.43022
	Std	168.9784	4527.194	8610.609	11,798.61	106.2277	7138.869	5117.499	0.27717999
F8	Best	2219.911	2301.407	2305.98	2300.795	2220.291	2221.889	2223.742	2215.83467
	Worst	2303.192	2320.279	2327.855	2550.84	2300.842	3238.041	4007.94	2301.17164
	Mean	2297.5	2307.001	2315.111	2325.678	2293.224	2341.933	2865.334	2296.25875
	Std	18.27148	5.52889	5.701036	55.53402	21.8177	212.3494	641.6428	18.9342359
F9	Best	2500	2722.003	2500.918	2749.388	2500.002	2500	2737.275	2500
	Worst	2780.04	2764.887	2923.949	2792.254	2748.407	2779.331	2776.437	2600
	Mean	2715.859	2740.05	2809.683	2767.348	2515.143	2728.255	2751.611	2505
	Std	93.77332	10.24654	84.81685	12.14551	23.94911	78.75515	9.611748	22.3606798
F10	Best	2897.836	2898.292	2717.372	2898.384	2897.94	2897.757	2898.754	2897.74287
	Worst	2949.504	3024.415	3024.521	2978.48	2949.906	2951.041	3024.674	2897.74287
	Mean	2935.808	2933.8	2924.869	2939.017	2927.937	2928.899	2933.125	2897.74287
	Std	19.37901	27.88147	59.08303	27.01963	23.94911	23.03498	26.03043	9.3312E−13
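
The Best, Worst, Mean, and Std rows summarize the final objective values of repeated independent runs. A minimal Python sketch of that summary follows; the use of the sample standard deviation is an assumption, and the run values are hypothetical, chosen only so that the summary matches the pattern of the ST-AL entries for F9 in the table above. The per-run data themselves are not given in the source.

```python
import statistics


def summarize_runs(final_values):
    """Summary statistics over the final objective values of independent runs."""
    return {
        "Best": min(final_values),    # minimization: smallest value is best
        "Worst": max(final_values),
        "Mean": statistics.mean(final_values),
        "Std": statistics.stdev(final_values),  # sample standard deviation (assumed)
    }


# Hypothetical runs: 19 reach 2500 and one stalls at 2600.
runs = [2500.0] * 19 + [2600.0]
summary = summarize_runs(runs)
```

A single poor run is enough to move the Worst and Std rows, which is why they are reported alongside the mean.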

Function	Measures	PSO	GWO	HHO	MFO	MPA	ALO	STOA	ST-AL
F1	Best	149.0675	9169.883	1498054	9938.403	429.5645	121.0709	1.18E+09	137.039303
	Worst	6076.59	2.71E+09	4004630	8.22E+09	11415.45	4758.852	5.14E+09	11797.7427
	Mean	1910.775	6.04E+08	2773719	2.31E+09	4802.742	1412.314	3.22E+09	5067.13731
	Std	1920.911	7.43E+08	716443.9	2.38E+09	3628.375	1215.155	1.4E+09	4540.32487
F2	Best	1468.85	1677.007	1854.2	2173.439	1798.769	2788.219	2516.219	1244.57253
	Worst	3631.735	3485.399	3391.141	4795.42	3548.621	4006.389	3716.063	1929.16945
	Mean	2684.459	2428.321	2459.846	3076.589	2535.579	3427.677	3146.28	1603.27468
	Std	680.6819	488.4162	452.0179	795.4393	568.8346	415.6953	364.3663	184.673132
F3	Best	749.1345	753.8559	812.0693	751.3783	728.6476	792.3075	835.9468	728.414091
	Worst	805.7087	810.3985	936.0164	1074.894	749.5001	896.3071	939.8215	761.631671
	Mean	772.9227	770.7944	899.7641	837.214	737.09	834.092	873.2247	742.661136
	Std	17.23145	15.32764	36.46678	101.9656	6.773366	35.01951	28.97488	9.85814681
F4	Best	1901.425	1902.745	1914.023	1905.823	1901.422	1903.54	1919.641	1902.28299
	Worst	1905.133	1950.59	1931.732	22,523.83	1902.599	1907.686	2671.891	1904.99868
	Mean	1902.894	1916.953	1921.658	7707.941	1902.007	1904.849	2098.6	1903.37977
	Std	1.012403	15.32228	6.07432	7939.735	0.36723	1.562298	236.5469	0.80847691
F5	Best	4576.084	45,510.86	43,245.39	4897.889	1734.605	12,376.85	33,292.3	1718.88259
	Worst	151,338.3	1,563,541	492,134.1	5,162,088	2203.713	253,429	486,787.9	1982.4812
	Mean	58,998.96	693,320.8	237,767.3	888,330	1919.761	111,336.6	269,894.8	1849.60918
	Std	42,310.16	550,428.2	132,439.6	1,429,444	113.9693	70,004.62	165,169.1	73.0268531
F6	Best	1602.271	1654.922	1894.872	1764.33	1602.207	1668.604	1822.076	1602.05363
	Worst	2313.125	1958.078	2312.84	2338.96	1720.192	2674.598	2489.339	1613.58199
	Mean	1935.191	1864.35	2090.694	2031.482	1612.784	2242.032	2065.039	1605.42989
	Std	177.8016	81.43428	123.0286	167.3889	23.71376	301.072	208.3496	3.73490572
F7	Best	3893.95	32,677.23	12,044.76	24,277.9	2102.306	4249.516	14,138.56	2101.59872
	Worst	161,212.1	249,374.2	455,508	1,198,037	2280.503	316,732.5	228,195.7	2234.34465
	Mean	28,323	135,952.6	115,152.6	299,970.3	2165.784	73,687.46	90,663.87	2136.33553
	Std	44,782.82	69,725.22	122,523.3	391,212.4	58.78263	106,741.1	73,679.27	45.0973721
F8	Best	2300	2310.62	2311.782	2301.171	2300.004	2300	2524.337	2300.01914
	Worst	5613.677	4339.147	6005.18	5964.614	2313.576	4763.698	6296.554	5138.35431
	Mean	3016.317	2819.914	3159.088	4041.776	2303.359	2692.017	5322.392	3145.49128
	Std	1304.161	753.7906	1536.643	1621.218	3.892559	915.7107	981.0725	1255.01527
F9	Best	2852.118	2821.487	2965.218	2837.652	2810.925	2852.385	2847.739	2810.61914
	Worst	3007.874	2916.136	3353.093	2945.812	2835.372	2927.431	2906.623	2841.71589
	Mean	2901.121	2857.805	3172.954	2885.383	2823.501	2887.074	2868.628	2821.48244
	Std	48.29315	29.57164	113.3423	25.77292	9.00791	25.40788	18.77983	8.18603102
F10	Best	2910.509	2924.751	2925.859	2910.67	2910.198	2914.069	2956.969	2910.22865
	Worst	3000.437	3181.11	3002.534	3169.652	2914.002	2999.653	3181.348	2913.82748
	Mean	2949.443	3027.891	2975.48	2961.343	2913.231	2970.505	3024.626	2913.14968
	Std	33.8739	81.12437	21.64809	75.05806	1.412285	23.04355	57.18155	1.3075439

ST-AL vs.	PSO	GWO	HHO	MFO	MPA	ALO	STOA
F1	8.007E−09	8.00655E−09	8.00655E−09	7.7176E−09	8.00655E−09	8.00655E−09	8.00655E−09
F2	2.563E−07	0.000247061	6.79562E−08	7.94795E−07	0.000686822	9.17277E−08	6.79562E−08
F3	1.576E−06	3.41558E−07	6.79562E−08	3.93881E−07	0.797197419	1.23464E−07	7.89803E−08
F4	0.0179386	0.003638826	6.79562E−08	4.54008E−06	0.00604033	0.033717669	2.95975E−07
F5	6.796E−08	6.79562E−08	6.79562E−08	6.79562E−08	6.79562E−08	6.79562E−08	6.79562E−08
F6	6.796E−08	1.57567E−06	9.17277E−08	6.0148E−07	1.43085E−07	4.53897E−07	6.79562E−08
F7	6.796E−08	6.79562E−08	6.79562E−08	6.79562E−08	6.79562E−08	6.79562E−08	6.79562E−08
F8	9.748E−06	6.79562E−08	6.79562E−08	1.65708E−07	0.010581211	1.25052E−05	0.000115901
F9	1.103E−07	3.37272E−08	3.94662E−08	3.37272E−08	3.37272E−08	4.61473E−08	3.37272E−08
F10	8.007E−09	8.00655E−09	2.10246E−07	8.00655E−09	8.00655E−09	8.00655E−09	8.00655E−09

ST-AL vs.	PSO	GWO	HHO	MFO	MPA	ALO	STOA
F1	0.0970911	9.0734E−06	3.39182E−06	6.13704E−06	0.966914777	0.042110617	3.39182E−06
F2	0.5067205	0.839859973	0.750831884	0.053097957	6.00576E−05	0.001353941	0.008615558
F3	0.0002462	0.000123346	3.65846E−05	0.000123346	0.193930852	3.65846E−05	3.65846E−05
F4	0.1939309	0.000384202	3.65846E−05	3.65846E−05	9.73457E−05	0.010193105	3.65846E−05
F5	0.0120228	0.035089116	0.544370146	0.068964333	3.65846E−05	0.174853307	0.370844333
F6	0.000592	3.65846E−05	3.65846E−05	3.65846E−05	3.65846E−05	3.65846E−05	3.65846E−05
F7	0.0086156	0.112351198	0.707453968	0.140955219	3.65846E−05	0.260236203	0.839859973
F8	9.75E−03	0.193930852	0.126022122	0.014137969	0.795012172	0.088533772	0.000155796
F9	3.658E−05	0.000592042	3.65846E−05	4.69487E−05	0.623604884	3.65846E−05	3.65846E−05
F10	0.0035498	3.65846E−05	3.65846E−05	0.000384202	0.019373319	3.65846E−05	3.65846E−05
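
The entries in the two "ST-AL vs." tables read as p-values from pairwise significance tests. This excerpt does not name the test; the sketch below assumes a two-sided Wilcoxon rank-sum test with a normal approximation and continuity correction, which is a common choice for comparing metaheuristics. The function names and sample data are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_p_value(a, b):
    """Two-sided rank-sum p-value (normal approximation, tie and continuity corrected)."""
    n1, n2 = len(a), len(b)
    total = n1 + n2
    pooled = list(a) + list(b)
    w = sum(average_ranks(pooled)[:n1])          # rank sum of the first sample
    expected = n1 * (total + 1) / 2.0
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12.0 * ((total + 1) - tie_term / (total * (total - 1)))
    if variance <= 0.0:
        return 1.0  # all observations identical: no evidence of a difference
    diff = w - expected
    if diff > 0:
        diff -= 0.5
    elif diff < 0:
        diff += 0.5
    z = diff / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical results of two algorithms over 20 runs each, fully separated.
algo_a = [float(v) for v in range(20)]
algo_b = [100.0 + v for v in range(20)]
p_separated = rank_sum_p_value(algo_a, algo_b)
p_identical = rank_sum_p_value([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Under this assumption, a p-value below 0.05 is usually read as a statistically significant difference between the two algorithms on that function.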

Datasets	Features	Samples	Classes	Category
Low dimensional datasets
Exactly	13	1000	2	Biology
Exactly2	13	1000	2	Biology
Lymphography	18	148	2	Biology
SpectEW	22	267	2	Biology
CongressEW	16	435	2	Politics
IonosphereEW	34	351	2	Electromagnetic
Vote	16	300	2	Politics
WineEW	13	178	3	Chemistry
BreastEW	30	569	2	Biology
PenglungEW	325	73	2	Biology
SonarEW	60	208	2	Biology
HeartEW	13	270	2	Biology
M-of-n	13	1000	2	Biology
Zoo	16	101	6	Artificial
High dimensional datasets
base_Brain_T21	10,367	50	4	Biology
base_leuk1	11,225	72	3	Biology

Dataset	Measures	PSO	GWO	HHO	MFO	MPA	ALO	STOA	ST-AL
Low dimensional datasets
Exactly	Mean	0.07978	0.01844	0.01315	0.00837	0.0046	0.21749	0.150503269	0.004615
Exactly	STD	0.11564	0.061812	0.020916	0.016775	1.78E−18	0.122488	0.149887136	1.78E−18
Exactly2	Mean	0.20945	0.20929	0.20441	0.20182	0.2021	0.21128	0.209408269	0.197752
Exactly2	STD	0.008543	0.006572	0.006572	0.006826	0.005229	0.004863	0.007001712	0.0041013
Lymphography	Mean	0.06994	0.05381	0.05508	0.05278	0.05174	0.08931	0.070816442	0.039141
Lymphography	STD	0.027083	0.024516	0.018863	0.019668	0.015403	0.034336	0.021362251	0.00728
SpectEW	Mean	0.09338	0.08129	0.0789	0.07798	0.0756	0.09786	0.09380303	0.075629
SpectEW	STD	0.015687	0.008927	0.006517	0.00568	0.000102	0.018575	0.015655563	0.0001016
CongressEW	Mean	0.02545	0.02049	0.01846	0.01757	0.01622	0.0241	0.024602371	0.014986
CongressEW	STD	0.006262	0.006182	0.005452	0.004817	0.00434	0.007149	0.007492906	0.0033024
IonosphereEW	Mean	0.03514	0.01966	0.03119	0.02389	0.0174	0.04109	0.03392937	0.018379
IonosphereEW	STD	0.015475	0.006458	0.0099	0.007166	0.004996	0.015815	0.013644524	0.0056419
Vote	Mean	0.00733	0.00331	0.00328	0.00341	0.00328	0.00704	0.00585	0.003156
Vote	STD	0.00836	0.000458	0.000344	0.000378	0.000569	0.006098	0.005509949	0.0001398
WineEW	Mean	0.00839	0.00318	0.00219	0.00223	0.0015	0.00705	0.006317308	0.001692
WineEW	STD	0.01481	0.006476	0.000943	0.00093	4.45E−19	0.010643	0.010596922	0.0005353
BreastEW	Mean	0.04517	0.03872	0.04261	0.03857	0.03942	0.04565	0.047908772	0.037335
BreastEW	STD	0.004681	0.006352	0.003666	0.003019	0.004017	0.005502	0.005164334	0.0030016
PenglungEW	Mean	0.15457	0.12199	0.15683	0.14906	0.0703	0.15284	0.075696648	0.146121
PenglungEW	STD	0.016478	0.03174	0.01945	0.007585	0.052681	0.027583	0.0635006	0.0033567
SonarEW	Mean	0.04266	0.0078	0.049	0.01249	0.01132	0.07172	0.05314881	0.009796
SonarEW	STD	0.017705	0.012851	0.017735	0.011743	0.011648	0.026809	0.024743855	0.0134368
HeartEW	Mean	0.20116	0.19378	0.19485	0.19161	0.19088	0.20746	0.198858974	0.188513
HeartEW	STD	0.011911	0.008557	0.00889	0.00801	0.007765	0.013599	0.008202481	0.0057219
M-of-n	Mean	0.00995	0.00977	0.00469	0.0046	0.0046	0.02921	0.030564423	0.004615
M-of-n	STD	0.014041	0.023072	0.000237	1.78E−18	1.78E−18	0.025765	0.046349329	1.78E−18
Zoo	Mean	0.00576	0.00219	0.00197	0.00247	0.0016	0.00384	0.0023125	0.002156
Zoo	STD	0.011085	0.000555	0.000583	0.000474	0.00043	0.001394	0.000577113	0.000516
High dimensional datasets
base_Brain_T21	Mean	0.367752	0.19934	0.209182	0.351421	0.25074	0.198704	0.198003858	0.104568
base_Brain_T21	STD	0.046541	0.000293	0.015412	0.069951	0.007727	0.000334	0.070133166	1.364E−06
base_leuk1	Mean	0.07072	0.00117	0.00039	0.0709	0.00328	0.00013	1.24722E−05	1.16E−05
base_leuk1	STD	0.09338	6.87E−05	0.000173	0.093281	0.000229	1.64E−05	1.13389E−05	1.008E−05
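
The Mean values in the table above are consistent with a weighted wrapper objective that is common in feature selection studies, combining the classification error with the fraction of features kept. The sketch below is an assumption rather than something this section states: the function name `wrapper_fitness`, the weight `alpha = 0.99`, and the formula itself are illustrative. Under that assumption, a subset of 6 of the 13 Exactly features with zero error gives roughly the 0.004615 reported for ST-AL on Exactly.

```python
def wrapper_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Weighted sum of error and selected-feature ratio (lower is better).

    Assumed form: alpha * error + (1 - alpha) * n_selected / n_total.
    alpha = 0.99 is a conventional choice, not a value taken from this section.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_selected <= n_total:
        raise ValueError("n_selected must lie in [0, n_total]")
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total


# Hypothetical case: zero classification error, 6 of 13 features kept.
exactly_fitness = wrapper_fitness(0.0, 6, 13)
print(round(exactly_fitness, 6))
```

With a weight this close to 1, the error term dominates, so the feature ratio mainly separates subsets that classify equally well.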

Dataset	ISOA	WOASA	SCHHO	GWOPSO	ASGW	GWOCrowSA	ST-AL
Exactly	1	1	0.812	1	0.999	0.99	1
Exactly2	0.7686	0.75	0.783	0.76	0.777	0.746	0.806
Lymphography	0.9252	0.89	0.97	0.92	0.884	0.87	0.9643
SpectEW	0.906	0.88	0.887	0.88	0.87	0.816	0.9259
CongressEW	0.985	0.98	0.97	0.98	0.97	0.963	0.9874
IonosphereEW	0.97	0.966	0.947	0.95	0.972	0.915	0.9831
Vote	0.985	0.97	0.987	0.97	0.984	0.948	1
WineEW	1	0.99	0.994	1	1	0.982	1
BreastEW	0.976	0.985	0.981	0.97	0.981	0.962	0.9658
PenglungEW	–	0.94	–	0.96	1	0.8595	0.856
SonarEW	0.9736	0.97	–	0.96	0.948	0.9058	0.9929
HeartEW	–	0.85	–	0.85	0.831	0.8326	0.8129
M-of-n	1	1	–	1	1	0.996	1
Zoo	1	0.97	–	1	1	0.9686	1
base_Brain_T21	–	–	–	–	–	–	0.894
base_leuk1	–	–	–	–	–	–	1

Dataset	ISOA	WOASA	SCHHO	GWOPSO	ASGW	GWOCrowSA	ST-AL
Exactly	6.89	6	4.43	6	6.87	6.4	5.5
Exactly2	3	1	2.07	1.6	7.93	4.6	7.4
Lymphography	7.62	6.8	2.23	9.2	11.2	8	6.95
SpectEW	8.43	9.6	6.23	8.4	10.17	8	4.7
CongressEW	5.6	4.4	2.23	4.4	8.83	5	3.95
IonosphereEW	8.4	11.4	4.27	13	17.3	13	5.6
Vote	7.25	5.8	3.7	3.4	8.97	4.6	5.05
WineEW	6.6	6.8	2.73	6	7.6	6.4	2.2
BreastEW	7.58	13.6	7.67	13.6	15.83	13.8	10.4
PenglungEW	–	325	–	130.8	170.3	165.8	117.15
SonarEW	20	60	–	31.2	35.3	29.6	16.35
HeartEW	–	13	–	5.8	6.367	5	4.35
M-of-n	7	13	–	6	6.867	6.4	5.95
Zoo	9.33	16	–	6.8	7.6	5.2	2.5
base_Brain_T21	–	–	–	–	–	–	70.5
base_leuk1	–	–	–	–	–	–	13
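
To put the counts in the last table in context, they can be related to the total number of features per dataset listed in the dataset description table. The sketch below assumes that the last table reports the average number of features selected by each method (its caption is not part of this section); the dictionary and function names are illustrative. It computes the fraction of features that ST-AL discards on a few of the datasets.

```python
# Total features (dataset table) and ST-AL counts (last table).
# Assumption: the ST-AL column holds the average number of selected features.
TOTAL_FEATURES = {"WineEW": 13, "Zoo": 16, "PenglungEW": 325,
                  "base_Brain_T21": 10367, "base_leuk1": 11225}
STAL_SELECTED = {"WineEW": 2.2, "Zoo": 2.5, "PenglungEW": 117.15,
                 "base_Brain_T21": 70.5, "base_leuk1": 13}


def reduction_ratio(selected, total):
    """Fraction of the original features that were discarded."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 1.0 - selected / total


ratios = {name: reduction_ratio(STAL_SELECTED[name], TOTAL_FEATURES[name])
          for name in TOTAL_FEATURES}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2%} of features removed")
```

On the two high dimensional datasets this ratio exceeds 99%, since only 70.5 and 13 features are kept out of 10,367 and 11,225.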
