Scientific Reports. 2026 Jan 8;16:3426. doi: 10.1038/s41598-025-33384-x

A block matrix incremental feature selection method based on fuzzy rough minimum classification error

Zhanwei Chen 1, Minggang Xing 2, Juan Li 1
PMCID: PMC12835165  PMID: 41507248

Abstract

In fuzzy rough set models, inner-product correlation serves as an effective evaluation function for feature selection, with its key advantage lying in its ability to characterize the minimum classification error inherent in the model. However, existing inner-product-based methods typically rely only on a subset of samples to approximate this error, making it difficult to fully and accurately capture the discriminative structure across the entire sample space. To address this limitation, this paper proposes a global-sample-oriented inner-product correlation criterion. By constructing a continuous and non-vanishing fuzzy membership structure over the entire universe of discourse, the proposed criterion significantly enhances the theoretical soundness and practical consistency of inner-product-based feature evaluation. Building upon this foundation and leveraging efficient matrix computation techniques, we design a static feature selection algorithm based on the Minimum Classification Error-based Feature Selection (MCEFS) criterion. Furthermore, to meet the demand for efficient updates in dynamic data environments, we develop a block-wise updating mechanism for the fuzzy decision and fuzzy relation matrices. We rigorously derive and prove a block-based incremental update strategy for the fuzzy lower approximation matrix, which effectively eliminates redundant recomputation of fuzzy lower approximations and substantially improves computational efficiency. Based on this strategy, we propose an incremental feature selection algorithm—Block Matrix-based MCEFS (BM-MCEFS). Finally, comprehensive comparative experiments on 12 public benchmark datasets validate the effectiveness and feasibility of the static MCEFS algorithm and clearly demonstrate the superior performance of BM-MCEFS in terms of computational efficiency and numerical stability.

Keywords: Fuzzy rough set, Feature selection, Block matrix, Incremental mechanisms, Fuzzy decision

Subject terms: Data mining, Computer science, Computational science

Introduction

In today’s data-driven world, high-dimensional datasets are becoming increasingly common, yet traditional learning algorithms often fail to process them effectively. As a result, feature selection methods, which can eliminate redundant information, have garnered widespread attention. By identifying a representative subset of features that retains the essential characteristics of the original data, feature selection simplifies the subsequent data analysis process. Currently, feature selection has become an important research topic in the fields of pattern recognition1,2 and machine learning3–5, and has been widely used in practice6–14. The fuzzy rough set theory proposed by Dubois and Prade15 is an important tool for feature selection using uncertainty in data16–18. In recent years, it has attracted the attention of many researchers19,20. Alnoor et al.21 proposed an application method based on Linear Diophantine Fuzzy Rough Sets and Multicriteria Decision-Making Methods, which can effectively identify oil transportation activities. Riaz et al.22 proposed linear Diophantine fuzzy sets (LDFS), which introduce reference parameters to constrain the membership and non-membership degrees, thereby demonstrating greater flexibility and robustness in multi-criteria decision-making. Yang et al.23 proposed a noise-aware fuzzy rough set method for feature selection. Ye et al.24 proposed a fuzzy rough set model for multi-attribute decision making for feature selection in multi-label learning. He et al.25 studied feature selection for incomplete decision information systems based on fuzzy rough sets. Zhang et al.26 redefined the fuzzy rough set model of fuzzy covering based on fuzzy rough set theory, providing a new direction of thought for the theory. Deng et al.27 conducted a theoretical analysis of fuzzy rough sets and proposed a feature selection algorithm that incorporates the distribution of labels. With the continuous deepening of this research, some researchers select features by constructing fuzzy rough set feature evaluation functions. Wang et al.28 introduced a distance measure into the fuzzy rough set and studied a calculation model for an iterative evaluation function based on variable distance parameters. Zhang et al.29 used information entropy to measure uncertainty for feature selection. Qian et al.30 proposed a label distribution feature selection algorithm based on mutual information. Qiu et al.31 studied a hierarchical feature selection method based on the Hausdorff distance. Sun et al.32 studied an online streaming feature evaluation function based on fuzzy rough sets for feature selection. In addition, An et al.33 proposed a feature selection method based on a rough relative fuzzy approximation of the maximum positive region by defining a relative fuzzy dependency function to evaluate the importance of features for decision-making. Liang et al.34 proposed a robust feature selection method based on the similarity of the kernel function and a relative classification uncertainty measure, using K nearest neighbors and Bayesian rules to generate the uncertainty measure. Zhang et al.35 enhanced the data-fitting ability of fuzzy rough set theory by adopting an adaptive learning mechanism, and thus proposed a feature selection algorithm based on adaptive relative fuzzy rough sets. Chen et al.36 conducted research on multi-source data and proposed an algorithm for fusion and feature selection that minimizes entropy to eliminate redundant features.

In general, the aforementioned approaches primarily construct dependency functions by extracting the maximum fuzzy membership degree of each sample with respect to the decision classes–i.e., preserving the maximal fuzzy positive region–to evaluate the importance of feature subsets. However, such methods utilize only the maximum membership value during data analysis and neglect the potentially valuable discriminative information embedded in the non-maximal membership degrees. To address this limitation, Wang et al.37 proposed a feature selection method based on the minimum classification error grounded in Bayesian decision theory, which posits that a smaller overlap between class-conditional probability density curves leads to a lower classification error. Their approach computes inner products between fuzzy membership degrees (excluding samples with zero membership values) to characterize this minimal classification error. Nevertheless, when inner-product correlations are computed solely on the basis of a subset of samples (i.e., partial samples), two critical issues arise. First, fuzzy membership functions derived from partial samples are often discontinuous or exhibit abrupt jumps over the global universe of discourse. This violates the continuity assumption that underlies the “minimal overlap implies minimal error” principle in Bayesian analysis, thus compromising its theoretical foundation and significantly degrading the reliability of the classification performance. Second, during feature evaluation, if inner-product correlations are computed from a subset of samples while the fuzzy positive region is estimated using the entire sample set, the two components rely on inconsistent sample bases. Specifically, under a given feature, the number of samples used to compute inner-product correlations may differ from that used to calculate the fuzzy positive region. In such cases, combining the inner-product correlation with the fuzzy positive region to compute the incremental dependency–and subsequently using this metric to assess the importance–lacks both reasonableness and consistency.

To more fully exploit the latent discriminative information inherent in the minimum classification error criterion and enhance the capability of inner-product correlation for feature selection, this paper makes the following contributions: Firstly, a non-zero fuzzy similarity relation function is constructed, and the decision information is fuzzified based on the class center sample strategy; Secondly, a continuous fuzzy membership degree curve is constructed on the universe of discourse based on the fuzzy membership function, and the degree of overlap between the fuzzy membership degree curves of different feature subsets is quantified using the inner product correlation, thereby enhancing the model’s screening ability for features. Finally, drawing on the matrix operation strategy, a matrix generation strategy for fuzzy membership degrees and inner product correlations is proposed to improve the computational efficiency of the algorithm. Based on the above research, this paper designs a feature selection algorithm based on the minimum classification error (Minimum Classification Error-based Feature Selection, MCEFS).

However, as data environments evolve, continuing to use static feature selection methods may result in a large number of redundant computations, thereby reducing the computational efficiency of the algorithm. Incremental feature selection methods, which leverage prior knowledge to select features from dynamically changing data, have thus garnered significant attention from researchers40–43. Sang et al.44 studied the incremental feature selection method for ordered data with dynamic interval values. Wang et al.45 proposed an incremental fuzzy tolerance rough set method for intuitionistic fuzzy information systems by updating the rough approximation of fuzzy tolerance. Zhang et al.46 proposed a novel incremental feature selection method using sample selection and accelerators based on the discriminative score feature selection framework. Yang et al.47 proposed an incremental feature selection method for interval-valued fuzzy decision information systems by studying two related incremental algorithms for sample insertion and deletion. Zhao et al.48 proposed a two-stage uncertainty measurement and designed an incremental feature selection algorithm capable of handling incomplete stream data. Xu et al.38 proposed a matrix-based incremental feature selection method based on weighted multi-granularity rough sets by minimizing the loss function to obtain the optimal weight vector, effectively improving the efficiency of feature selection algorithms. Zhao et al.39, based on fuzzy rough set theory, proposed a consistency principle to evaluate the significance of the feature, and combined with the principle of representative samples, designed three acceleration algorithms for incremental feature selection (Table 1 presents a detailed comparison).

Table 1.

Comparison of incremental feature selection methods.

Item | This study | Xu et al.38 | Zhao et al.39
Tool type | Fuzzy rough set | Multigranulation rough set | Fuzzy rough set
Relation type | Fuzzy relation | Crisp relation | Fuzzy relation
Feature selection mechanism | The minimum classification error | Optimal neighborhood; knowledge granularity weights | Inline graphic-level consistency region; Inline graphic-level membership degree; Inline graphic-level reduct
Evaluation method | Fuzzy dependency; inner product dependency | Conditional entropy; feature weights (positive correlation, negative correlation) | Consistency approximation importance measure; fuzzy dependency
Incremental mechanism | Block updating of the fuzzy relation matrix and fuzzy decision matrix; block updating of the fuzzy lower approximation matrix | Update of the neighborhood relation matrix; update of the decision matrix | Update of the positive and negative region vectors
Acceleration strategy | Check whether the new samples have any impact on the original acceleration method
Computational complexity | Inline graphic | Inline graphic | Inline graphic

In summary, most existing incremental methods primarily focus on updating strategies for sample relationships when samples change, while paying less attention to updating strategies for fuzzy decisions and the generation of fuzzy rough approximations. To reduce redundant calculations of fuzzy rough approximations and enhance the efficiency of incremental algorithms, this paper has carried out the following work: First, we designed fuzzy relation and fuzzy decision matrices based on a block matrix updating strategy. Second, through an in-depth analysis of the calculation method for fuzzy lower approximations, we developed a block-based updating method for fuzzy lower approximation matrices to reduce redundant calculations of fuzzy approximations. Finally, we proposed an incremental feature selection algorithm based on block matrices (BM-MCEFS) to improve the algorithm’s adaptability to dynamic data environments.

The paper is organized as follows. Section “Preliminaries” introduces the fundamental concepts of fuzzy rough sets and fuzzy decision systems. In Section “A fuzzy rough model based on minimum classification error”, the proposed fuzzy similarity relation is incorporated into fuzzy decision information, based on which an inner-product dependency function is constructed and a corresponding static feature-selection algorithm is designed by matrix operations. Section “Incremental method based on block matrix” presents an incremental feature-selection algorithm that realizes dynamic updates using block-matrix techniques. Experimental datasets and an in-depth analysis of the results are provided in Section “Experimental results and analysis”. Finally, Section “Conclusions” summarizes the paper, discusses its limitations, and outlines directions for future research.

Preliminaries

To facilitate understanding of the subsequent content of this paper, this section provides a brief review of concepts related to fuzzy rough sets that are pertinent to this study.

Definition 1

49 Let the triplet (U, A, D) represent a decision table, in which Inline graphic represents a non-empty finite domain of discourse, Inline graphic is a condition set, and Inline graphic is a decision set. If A is a mapping on U, Inline graphic, then A is called a fuzzy set on U, in which A(X) represents the degree of membership of X to A; the family of fuzzy sets on the domain U is denoted F(U), that is, Inline graphic.

Definition 2

50 Let U be the domain Inline graphic. If R represents the fuzzy similarity relation of any sample Inline graphic with respect to the conditional feature a, then it satisfies the following properties:

  1. Reflexivity: Inline graphic,

  2. Symmetry: Inline graphic.

If there exists a feature subset Inline graphic, then the fuzzy similarity relation on B is defined as Inline graphic, that is, Inline graphic. The fuzzy granularity of the sample Inline graphic.

Definition 3

51 Let (U, A, D) be a decision table, Inline graphic, Inline graphic. If Inline graphic is a fuzzy subset, then the fuzzy lower and upper approximations in B are respectively defined as follows:

$\underline{R_B}P(x) = \bigwedge_{u \in U} \big( (1 - R_B(x, u)) \vee P(u) \big)$  (1)
$\overline{R_B}P(x) = \bigvee_{u \in U} \big( R_B(x, u) \wedge P(u) \big)$  (2)

In which “Inline graphic” and “Inline graphic” denote the maximum and minimum operations, respectively. Inline graphic represents the degree of certainty that the sample Inline graphic belongs to P; Inline graphic represents the degree of possibility that the sample Inline graphic belongs to P; Inline graphic is then called a pair of fuzzy approximation operators of P.

Definition 4

52 Let (U, A, D) be a decision table, where Inline graphic, and let the decision D be divided into r crisp equivalence classes in the domain U, that is, Inline graphic. For any sample Inline graphic, if Inline graphic, then the fuzzy lower and upper approximations can be simplified as:

$\underline{R_B}D_j(x) = \bigwedge_{u \notin D_j} \big( 1 - R_B(x, u) \big)$  (3)
$\overline{R_B}D_j(x) = \bigvee_{u \in D_j} R_B(x, u)$  (4)

Definition 5

53 Let (U, A, D) be a decision table, Inline graphic, Inline graphic. For any Inline graphic, its degree of membership in the fuzzy positive region is defined as:

$POS_B(D)(x) = \bigvee_{j \leq r} \underline{R_B}D_j(x)$  (5)

The fuzzy rough dependency of the decision D on the feature subset B is defined as:

$\gamma_B(D) = \frac{|POS_B(D)|}{|U|} = \frac{\sum_{x \in U} POS_B(D)(x)}{|U|}$  (6)

In which Inline graphic represents the cardinality of the domain U. The fuzzy dependency function can be interpreted as the ratio of the cardinality of the fuzzy positive region to the total number of samples. In the theory of fuzzy rough sets, it is commonly used to evaluate the significance of a feature subset.
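As a computational illustration of Definitions 3-5, the following NumPy sketch evaluates the fuzzy lower and upper approximations, the positive-region membership, and the dependency on a small hand-made similarity matrix; the toy relation, the two crisp classes, and all variable names are illustrative assumptions rather than material from the original text.

import numpy as np

# Toy fuzzy similarity matrix R_B over U = {x1, x2, x3, x4} (reflexive and symmetric)
R = np.array([[1.00, 0.80, 0.30, 0.20],
              [0.80, 1.00, 0.40, 0.25],
              [0.30, 0.40, 1.00, 0.70],
              [0.20, 0.25, 0.70, 1.00]])
# Crisp decision classes D1 = {x1, x2}, D2 = {x3, x4} as 0/1 membership rows
D = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float)

# Definition 3: lower(x, j) = min_u max(1 - R(x, u), D_j(u)); upper(x, j) = max_u min(R(x, u), D_j(u))
lower = np.min(np.maximum(1.0 - R[:, :, None], D.T[None, :, :]), axis=1)
upper = np.max(np.minimum(R[:, :, None], D.T[None, :, :]), axis=1)

# Definition 5: positive-region membership (Equation 5) and fuzzy rough dependency (Equation 6)
pos = lower.max(axis=1)
gamma = pos.sum() / len(pos)

print(lower.round(3))
print(upper.round(3))
print(round(gamma, 3))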

From Equation 6, it is evident that the fuzzy rough dependency retains only the maximum membership of each sample with respect to the decisions; its analysis of the area of overlap between decisions, that is, of the minimum classification error, is insufficient. Figure 1 depicts the basic process of feature selection based on classical fuzzy rough sets.

Fig. 1. The basic concept diagram of fuzzy rough set.

A fuzzy rough model based on minimum classification error

In this section, a method for generating fuzzy lower approximation matrices by incorporating matrix operations is proposed. Furthermore, based on the minimum classification error criterion, a novel inner product relevance function is constructed, and a corresponding static feature selection algorithm is designed accordingly. The flow of the static algorithm is shown in Fig. 2.

Fig. 2. Static algorithm framework diagram.

Definition 6

Let (U, A, D) be a decision table. If Inline graphic, Inline graphic, then the fuzzy similarity relation of sample Inline graphic is defined as:

$R_B(x_i, x_j) = \prod_{a \in B} \frac{1}{1 + |a(x_i) - a(x_j)|}$  (7)

in which Inline graphic represents the value of sample Inline graphic under feature a. From Equation 7, it is apparent that Inline graphic satisfies reflexivity, symmetry, and Inline graphic. Alternatively, letting Inline graphic, the fuzzy similarity relation matrix can be expressed as Inline graphic, in which Inline graphic.

The advantage of Equation 7 lies first in its strictly positive nature, which ensures that the induced fuzzy lower approximation is continuous over the entire universe of discourse. Second, its wide mapping range (approximately (0,1]) enables relatively high sensitivity to differences in feature values. As evidenced by Table 3, which is derived from the decision table in Table 2, Equation 7 covers a wider range within (0, 1) than the similarity relation built with the Gaussian kernel. For the fuzzy similarities between sample Inline graphic and the remaining samples in Table 2, Equation 7 gives a spread of 0.586 (max 0.722 − min 0.136), whereas the Gaussian kernel yields only 0.451. Equation 7 also discriminates samples better than the Euclidean-distance-based similarity: samples Inline graphic and Inline graphic are distinct, yet their Euclidean similarities to Inline graphic are identical, while Equation 7 produces different values and thus distinguishes them effectively.

Table 3.

Comparison table of fuzzy similarity relations constructed based on different distance Functions.

Equation 7 | Gaussian kernel | Euclidean distance (each block is a 9 × 9 similarity matrix over the samples of Table 2; the three blocks appear side by side in each row)
1 0.625 0.722 0.411 0.282 0.21 0.142 0.136 0.111 1 0.968 0.985 0.861 0.799 0.695 0.523 0.517 0.439 1 0.797 0.85 0.646 0.599 0.54 0.467 0.465 0.438
0.625 1 0.508 0.495 0.311 0.226 0.15 0.147 0.118 0.968 1 0.93 0.872 0.794 0.716 0.56 0.532 0.468 0.797 1 0.724 0.656 0.596 0.55 0.481 0.471 0.448
0.722 0.508 1 0.37 0.316 0.236 0.159 0.153 0.124 0.985 0.93 1 0.848 0.828 0.712 0.54 0.532 0.455 0.85 0.724 1 0.636 0.619 0.548 0.474 0.471 0.444
0.411 0.495 0.37 1 0.449 0.419 0.274 0.26 0.213 0.861 0.872 0.848 1 0.884 0.878 0.752 0.758 0.645 0.646 0.656 0.636 1 0.668 0.662 0.57 0.573 0.516
0.282 0.311 0.316 0.449 1 0.612 0.421 0.405 0.311 0.799 0.794 0.828 0.884 1 0.968 0.871 0.857 0.811 0.599 0.596 0.619 0.668 1 0.797 0.655 0.642 0.607
0.21 0.226 0.236 0.419 0.612 1 0.582 0.56 0.426 0.695 0.716 0.712 0.878 0.968 1 0.957 0.945 0.904 0.54 0.55 0.548 0.662 0.797 1 0.771 0.749 0.69
0.142 0.15 0.159 0.274 0.421 0.582 1 0.656 0.674 0.523 0.56 0.54 0.752 0.871 0.957 1 0.969 0.972 0.467 0.481 0.474 0.57 0.655 0.771 1 0.799 0.808
0.136 0.147 0.153 0.26 0.405 0.56 0.656 1 0.574 0.517 0.532 0.532 0.758 0.857 0.945 0.969 1 0.962 0.465 0.471 0.471 0.573 0.642 0.749 0.799 1 0.782
0.111 0.118 0.124 0.213 0.311 0.426 0.674 0.574 1 0.439 0.468 0.455 0.645 0.811 0.904 0.972 0.962 1 0.438 0.448 0.444 0.516 0.607 0.69 0.808 0.782 1

Table 2.

Decision table.

U a1 a2 a3 a4 a5 D
x1 0.09 0.16 0.22 0.29 0.27 1
x2 0.26 0.27 0.16 0.15 0.29 1
x3 0.12 0.06 0.25 0.42 0.32 1
x4 0.34 0.34 0.66 0.24 0.36 2
x5 0.46 0.42 0.48 0.67 0.45 2
x6 0.58 0.56 0.64 0.61 0.49 2
x7 0.81 0.69 0.75 0.67 0.54 3
x8 0.61 0.76 0.88 0.71 0.55 3
x9 0.79 0.86 0.74 0.82 0.61 3
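The similarity values in Table 3 can be reproduced numerically. The sketch below uses three assumed closed forms, none of which is stated explicitly in the extracted text: the per-feature similarity 1/(1 + |a(x_i) − a(x_j)|) combined multiplicatively for the Equation 7 column (this form matches the reported values), a Gaussian kernel with σ = 1, and the reciprocal-distance similarity 1/(1 + Euclidean distance); all variable names are illustrative.

import numpy as np

# Samples x1-x9 of Table 2 (features a1-a5)
X = np.array([[0.09, 0.16, 0.22, 0.29, 0.27],
              [0.26, 0.27, 0.16, 0.15, 0.29],
              [0.12, 0.06, 0.25, 0.42, 0.32],
              [0.34, 0.34, 0.66, 0.24, 0.36],
              [0.46, 0.42, 0.48, 0.67, 0.45],
              [0.58, 0.56, 0.64, 0.61, 0.49],
              [0.81, 0.69, 0.75, 0.67, 0.54],
              [0.61, 0.76, 0.88, 0.71, 0.55],
              [0.79, 0.86, 0.74, 0.82, 0.61]])

diff = np.abs(X[:, None, :] - X[None, :, :])     # per-feature |a(x_i) - a(x_j)|
R_eq7 = np.prod(1.0 / (1.0 + diff), axis=2)      # assumed Equation 7 form (matches Table 3)

dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
R_gauss = np.exp(-dist ** 2 / 2.0)               # Gaussian-kernel similarity, sigma = 1 assumed
R_euclid = 1.0 / (1.0 + dist)                    # Euclidean-distance-based similarity

# The first rows reproduce the first rows of the three blocks of Table 3, e.g.
print(R_eq7[0].round(3))     # 1, 0.625, 0.722, 0.411, 0.282, 0.21, 0.142, 0.136, 0.111
print(R_gauss[0].round(3))
print(R_euclid[0].round(3))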

Definition 7

Let (U, A, D) be a decision table, Inline graphic, Inline graphic, Inline graphic, then the class center sample can be defined as:

[Equation 8]

by incorporating the class center samples, the corresponding fuzzy decision can be derived from U/D as follows:

[Equation 9]

in which Inline graphic represents the degree of fuzzy similarity between the sample x and the center sample of the class Inline graphic. For Inline graphic, Inline graphic, Inline graphic, Inline graphic. Let Inline graphic, Inline graphic, the fuzzy decision matrix can be expressed as Inline graphic, in which Inline graphic.
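Since the closed forms of Equations 8 and 9 are not reproduced here, the following sketch encodes one plausible reading that is consistent with Example 2 (each sample's fuzzy decision memberships sum to one): the class centre is taken as the sample with the largest total intra-class similarity, and the fuzzy decision is the similarity to each class centre normalized per sample. Both choices, and all names used, are assumptions.

import numpy as np

def class_centers(R, labels):
    # Assumed reading of Equation 8: the centre of a class is the sample with the
    # largest total intra-class similarity.
    centers = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        intra = R[np.ix_(idx, idx)].sum(axis=1)
        centers[c] = idx[np.argmax(intra)]
    return centers

def fuzzy_decision(R, labels):
    # Assumed reading of Equation 9: similarity to each class centre, normalized so
    # that every sample's fuzzy decision memberships sum to one (as in Example 2).
    centers = class_centers(R, labels)
    classes = sorted(centers)
    S = np.stack([R[:, centers[c]] for c in classes], axis=1)
    return S / S.sum(axis=1, keepdims=True), centers

# Toy usage with a small reflexive, symmetric similarity matrix
R = np.array([[1.0, 0.7, 0.2, 0.1],
              [0.7, 1.0, 0.3, 0.2],
              [0.2, 0.3, 1.0, 0.8],
              [0.1, 0.2, 0.8, 1.0]])
labels = np.array([1, 1, 2, 2])
FD, centers = fuzzy_decision(R, labels)
print(centers)
print(FD.round(3))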

Definition 8

Let (U, A, D) be a decision table, Inline graphic, Inline graphic. Inline graphic is the fuzzy decision corresponding to Inline graphic; the fuzzy lower and upper approximations are defined as:

$\underline{R_B}FD_{D_j}(x) = \bigwedge_{u \in U} \big( (1 - R_B(x, u)) \vee FD_{D_j}(u) \big)$  (10)
$\overline{R_B}FD_{D_j}(x) = \bigvee_{u \in U} \big( R_B(x, u) \wedge FD_{D_j}(u) \big)$  (11)

From Equation 7, we can see Inline graphic and obtain Inline graphic. Since Inline graphic, Inline graphic, Inline graphic, that is, Inline graphic. It can be seen that in the domain U, Inline graphic generates an uninterrupted fuzzy lower approximation membership curve, as shown in Fig. 3.

Fig. 3. Fuzzy lower approximate membership curves of three decision classes on feature subset B.

According to Inline graphic and Inline graphic, combined with matrix operations, the fuzzy lower and upper approximation matrices can be computed as follows:

[Equation 12]
[Equation 13]

In which Inline graphic denotes the element-wise complement of Inline graphic (each entry replaced by one minus its value). Let “Inline graphic” denote the operation that, for each row of Inline graphic and the corresponding column of the matrix Inline graphic, first takes the element-wise maximum (union) and then takes the minimum (intersection) over all of these maxima; that is, “Inline graphic” is equivalent to “Inline graphic”. By the same argument, “Inline graphic” represents the operation “Inline graphic”. Here we can get Inline graphic.
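A compact NumPy rendering of the matrix construction just described: the lower-approximation matrix combines the element-wise complement of the relation matrix with the fuzzy decision matrix through a max-then-min composition, and the upper-approximation matrix uses the dual min-then-max composition. The function and variable names, and the toy data, are illustrative only.

import numpy as np

def lower_upper_matrices(M_R, M_FD):
    # Lower: for each row of the complement of M_R and each column of M_FD, take the
    # element-wise maximum and then the minimum over all samples (as described for Equation 12).
    # Upper: dual composition with minimum then maximum (as described for Equation 13).
    comp = 1.0 - M_R
    lower = np.min(np.maximum(comp[:, :, None], M_FD[None, :, :]), axis=1)
    upper = np.max(np.minimum(M_R[:, :, None], M_FD[None, :, :]), axis=1)
    return lower, upper

# Toy usage: 3 samples, 2 decision classes
M_R = np.array([[1.0, 0.8, 0.3],
                [0.8, 1.0, 0.4],
                [0.3, 0.4, 1.0]])
M_FD = np.array([[0.7, 0.3],
                 [0.6, 0.4],
                 [0.2, 0.8]])
low, upp = lower_upper_matrices(M_R, M_FD)
print(low.round(3))
print(upp.round(3))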

Property 1

Let Inline graphic, Inline graphic, let U/D denote the crisp partition of the decision over the domain, and let FD denote the fuzzy decision corresponding to U/D; then the following holds:

(1) If Inline graphic, then Inline graphicInline graphic, Inline graphicInline graphic,

(2) If Inline graphic, then Inline graphic, Inline graphic.

Proof

(1) If Inline graphic, then from Equation 3 we can obtain Inline graphic. FD represents the fuzzy decision corresponding to U/D, and from Equation 10 we can obtain Inline graphic. Therefore, we can get Inline graphic, thus Inline graphic. Quod erat demonstrandum.

In the same way, it can be demonstrated that Inline graphic

(2) From Inline graphic we have Inline graphic, that is, Inline graphic, and then we can obtain Inline graphic. From Equation 10 we have Inline graphic, that is, Inline graphic. Quod erat demonstrandum.

In the same way, it can be demonstrated that Inline graphic. Inline graphic

Clearly, from Property 1, Inline graphic satisfies monotonicity, and relative to Inline graphic, the fuzzy lower approximation Inline graphic is expanded.

Definition 9

Let (U, A, D) be a decision table, Inline graphic, Inline graphic; the fuzzy positive region and the fuzzy rough dependency are redefined, respectively, as:

$POS_B(D)(x) = \bigvee_{j \leq r} \underline{R_B}FD_{D_j}(x)$  (14)
$\gamma_B(D) = \frac{\sum_{x \in U} POS_B(D)(x)}{|U|}$  (15)

Clearly, as illustrated in Fig. 3, Inline graphic corresponds to the solid part of the curves, while the dotted part represents the overlapping area of the fuzzy lower approximation membership curves, that is, the minimum classification error.

Definition 10

Let (U, A, D) be a decision table, Inline graphic, Inline graphic, Inline graphic, and let FD be the fuzzy decision corresponding to U/D; the inner product relevance function is then defined as:

[Equation 16]

in which U is the domain and r is the number of decision categories. Since Inline graphic, Inline graphic can be obtained.

Next, a simple and intuitive example demonstrates the key role played by the inner product dependency function.

Example 1

Let Inline graphic, Inline graphic, Inline graphic. FD represents the fuzzy decision corresponding to U/D. If the lower approximation membership degrees of the sample to Inline graphic are Inline graphic = Inline graphic, Inline graphic = Inline graphic; Inline graphic = Inline graphic, Inline graphic = Inline graphic; Inline graphic = Inline graphic, Inline graphic = Inline graphic. Obviously, if only based on the maximal fuzzy positive region Inline graphic = Inline graphic and Inline graphic = Inline graphic, then according to the fuzzy dependency function Inline graphic = Inline graphic, Inline graphic can be obtained. At this point, it can be observed that the fuzzy dependency function only retains the maximum fuzzy approximation degree of each sample, and it is therefore impossible to distinguish the feature sets B and C. If the inner product dependency function is used, then we can obtain Inline graphic, and similarly Inline graphic. At this point, it can be observed that the classification capabilities of the feature sets B and C are obviously different.

At this point, it is obvious that the method based on inner product functions can effectively enhance the feature recognition ability.

Theorem 1

Let (U, A, D) be a decision table, Inline graphic, Inline graphic = Inline graphic. If Inline graphic = Inline graphic, then Inline graphic = Inline graphic, where Inline graphic is an element in the matrix Inline graphic.

Proof

From Definition 8, the transpose matrix of Inline graphic is Inline graphic . Let Inline graphic be an element in the matrix Inline graphic, Inline graphic, it is evident that Inline graphic, and when Inline graphic, Inline graphic can be obtained. Therefore, Inline graphic. Quod erat demonstrandum. Inline graphic

By Theorem 1, the calculation of the inner product dependent function can be transformed into a matrix operation.
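One way to realize this in code is to form the Gram matrix of the columns of the fuzzy lower approximation matrix; its off-diagonal entries are the pairwise inner products between the membership vectors of different decision classes. The aggregation into a single score (sum of off-diagonal entries divided by |U|) is an assumed reading of Equation 16, not a verbatim transcription, and the toy data are hypothetical.

import numpy as np

def inner_product_correlation(lower):
    # Gram matrix of the lower-approximation columns: entry (j, l) is the inner product
    # between the membership vectors of decision classes j and l. Aggregating the
    # off-diagonal entries and dividing by |U| is an assumption.
    G = lower.T @ lower
    overlap = (G.sum() - np.trace(G)) / lower.shape[0]
    return G, overlap

# Toy lower-approximation matrix: 4 samples, 3 decision classes
lower = np.array([[0.70, 0.10, 0.05],
                  [0.60, 0.20, 0.10],
                  [0.05, 0.65, 0.15],
                  [0.10, 0.55, 0.30]])
G, overlap = inner_product_correlation(lower)
print(G.round(3))
print(round(overlap, 3))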

Theorem 2

Let (U, A, D) be a decision table. If Inline graphic, then Inline graphic.

Proof

From Inline graphic, Inline graphic, we can get Inline graphic, Inline graphicInline graphic. Thus, for Inline graphic, Inline graphic, we can get Inline graphicInline graphic, Inline graphicInline graphic. Therefore, Inline graphicInline graphic, combined with Equation 16, Inline graphic can be obtained. Quod erat demonstrandum. Inline graphic

From Theorem 2, the inner product dependence function satisfies the monotonicity requirement on the domain U.

Theorem 3

Let (U, A, D) be a decision table, Inline graphic, Inline graphic. If Inline graphic, then Inline graphic.

Proof

For Inline graphic, since Inline graphic, Inline graphic, we can get Inline graphic, Inline graphic, Inline graphic. Consequently, Inline graphicInline graphic can be obtained, that is, Inline graphicInline graphic, From Equation 16, Inline graphic can be obtained. Therefore, for Inline graphic, Inline graphic, Inline graphic = Inline graphic if and only if Inline graphic = Inline graphic, and since Inline graphic = Inline graphic, Inline graphic = Inline graphic, we can get Inline graphic = Inline graphic, then Inline graphic = Inline graphic can be obtained, therefore, when Inline graphic = Inline graphic, Inline graphic = Inline graphic. Quod erat demonstrandum. Inline graphic

Remark 1

The converse of Theorem 3 does not necessarily hold. Suppose that when Inline graphic, Inline graphic can be obtained from the fuzzy rough dependency, that is,

[equation omitted]

clearly, for Inline graphic, Inline graphic can be obtained provided only that the following is satisfied:

[equation omitted]

however, it follows from Theorem 3 that when Inline graphic, Inline graphic. Therefore, when Inline graphic, Inline graphic does not necessarily hold.

It is evident from the above that the relevance of the inner product encompasses the classification information captured by the fuzzy rough dependency. However, the fuzzy rough dependency cannot fully represent the classification information conveyed by the inner product relevance function.

As shown in Fig. 4, the domain U illustrates the superimposed distributions of the fuzzy lower approximation membership curves for feature subsets H and F. When Inline graphic, the feature subsets H and F exhibit the same classification ability according to fuzzy rough dependency. However, according to Bayesian Decision Theory, a smaller overlap between membership curves corresponds to a lower classification error rate. Therefore, it can be observed from Fig. 4 that in the feature subset F, the overlap area of the membership curve of the fuzzy lower approximation Inline graphic, Inline graphic, Inline graphic is obviously larger than that of the membership curve of Inline graphic, Inline graphic, Inline graphic in the feature subset H. Thus, if Inline graphic, E is Inline graphic, P is Inline graphic, J is Inline graphic, then the inner product is Inline graphic. It follows that in the domain U, the relevance of the inner product can reflect the degree of fuzzy lower approximation overlap of different feature spaces, that is, the effect of minimum classification error on the classification of the data. Therefore, although the feature subsets H and F have the same fuzzy positive region, the feature subset H has a greater classification ability than the feature subset F according to Bayesian Decision Theory.

Fig. 4. Fuzzy lower approximation membership curves of feature subsets H and F. In the feature spaces H and F, if the analysis is based only on the maximum fuzzy positive region, the classification performance of the two spaces appears similar. However, when the minimum classification error is also considered, the distinction in classification capability between these two spaces becomes apparent.

Definition 11

Let (U, A, D) be a decision table, Inline graphic. If B satisfies the following conditions:

  1. Inline graphic,

  2. Inline graphic.

Then B is called a reduct of A, that is, B is a minimal feature subset with the same classification ability as A.

If Inline graphic, then the importance of the feature a to B is defined as:

[Equation 17]

The numerator Inline graphic represents the increase in fuzzy dependency induced by the addition of the feature a, while the denominator is defined as the square root of the function based on the inner-product Inline graphic. Clearly, the stronger the discriminative power of the feature a, the larger the resulting measure Inline graphic. Therefore, the importance measure proposed in this formulation jointly accounts for both the increase in fuzzy dependency and the minimization of the classification error (as reflected by the inner-product term). Consequently, Equation 17 provides a more accurate and comprehensive assessment of the classification capability of a candidate feature.
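The verbal description above translates directly into a small helper; evaluating the inner-product term on B ∪ {a}, and the example arguments shown, are assumptions, since the exact form of Equation 17 is not visible here.

def significance(gamma_B_a, gamma_B, ip_B_a, eps=1e-12):
    # Importance of feature a with respect to subset B, as described for Equation 17:
    # increase in fuzzy dependency divided by the square root of the inner-product term.
    # Evaluating the inner product on B union {a} is an assumption.
    return (gamma_B_a - gamma_B) / (ip_B_a ** 0.5 + eps)

print(round(significance(0.62, 0.55, 0.04), 3))   # illustrative numbers only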

Example 2

Let Table 2 be a decision table, in which Inline graphic, Inline graphic, Inline graphic Inline graphic, Inline graphic. Then the calculation example is as follows.

  1. From Definition 6, the fuzzy similarity relation matrix Inline graphic can be obtained as follows:
    [fuzzy similarity relation matrix omitted]
  2. For Inline graphic, the degree of intra-class similarity is: Inline graphic=1.422, Inline graphic=1.416, Inline graphic=1.265. Therefore, from Equation 8, we can get Inline graphic, and similarly Inline graphic, Inline graphic. From Equation 9, Inline graphic=0.702 can be obtained, and similarly, Inline graphic=0.198, Inline graphic=0.100. The fuzzy decision matrix Inline graphic is shown as follows:
    [fuzzy decision matrix omitted]
  3. From Equation 12 and Equation 13, the fuzzy lower approximation matrix Inline graphic and fuzzy upper approximation matrix Inline graphic can be obtained as:
    [fuzzy lower and upper approximation matrices omitted]
    The fuzzy positive region and the inner product matrix are:

    Inline graphic Inline graphic ,

    Inline graphic Inline graphic. Thus, the inner product Inline graphic=4.384 can be obtained.

Based on the above analysis, this paper designs a feature selection algorithm based on the minimum classification error (MCEFS). The pseudo-code of the algorithm is shown in Algorithm 1.

Algorithm 1. MCEFS

In Algorithm 1, the parameter Inline graphic is used to terminate the main loop. In fact, the optimal value of Inline graphic varies for different datasets. Assuming the sample size, the number of features, and the number of decision classes are n, m, and c respectively, the first step initializes the feature selection conditions; the third step constructs the fuzzy similarity relation, with a computational complexity of Inline graphic; the fourth step processes the fuzzy decision, with a computational complexity of Inline graphic; the fifth step calculates the fuzzy lower approximation, with a computational complexity of Inline graphic; based on this, the sixth step calculates the inner product correlation under the candidate features according to the fuzzy lower approximation values, with a computational complexity of Inline graphic; the eighth step calculates the fuzzy rough dependency degrees and combines the increment of the fuzzy rough dependency with the inner product correlation to evaluate the dependency increment brought by the new feature; the ninth step selects the feature with the highest dependency increment from the candidate features and adds it to the feature subset. The eleventh step judges the dependency increment brought by the new feature: when the difference between the values after and before the addition is greater than the preset parameter Inline graphic, the algorithm continues to loop from step 1 to step 9; otherwise, the algorithm terminates and outputs the final feature selection result. Therefore, in the process of finding the optimal feature subset, the evaluation may need to be performed Inline graphic times. Thus, the total complexity of Algorithm 1 is Inline graphic.
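Putting the pieces together, the sketch below mirrors the greedy forward-selection loop of Algorithm 1. It relies on the assumed closed forms introduced earlier (Equation 7 similarity, centre-based fuzzy decision, Gram-matrix inner product), so it should be read as an illustrative reconstruction rather than the authors' exact implementation.

import numpy as np

def fuzzy_similarity(X, feats):
    # Assumed Equation 7 form: product over the chosen features of 1/(1 + |a(x_i) - a(x_j)|).
    d = np.abs(X[:, None, feats] - X[None, :, feats])
    return np.prod(1.0 / (1.0 + d), axis=2)

def fuzzy_decision(R, y):
    # Assumed Equations 8-9: class centre = sample with the largest intra-class similarity,
    # fuzzy decision = similarity to the centres, normalized per sample.
    classes = np.unique(y)
    centers = [np.where(y == c)[0][np.argmax(R[np.ix_(y == c, y == c)].sum(axis=1))]
               for c in classes]
    S = R[:, centers]
    return S / S.sum(axis=1, keepdims=True)

def lower_approx(R, FD):
    # Definition 8: min over u of max(1 - R(x, u), FD_j(u)).
    return np.min(np.maximum(1.0 - R[:, :, None], FD[None, :, :]), axis=1)

def evaluate(X, y, feats):
    R = fuzzy_similarity(X, feats)
    L = lower_approx(R, fuzzy_decision(R, y))
    gamma = L.max(axis=1).mean()                    # fuzzy rough dependency
    G = L.T @ L
    ip = (G.sum() - np.trace(G)) / len(y)           # assumed inner-product correlation
    return gamma, ip

def mcefs(X, y, delta=0.0):
    # Greedy forward selection mirroring Algorithm 1: pick the feature with the highest
    # importance (assumed Equation 17 form) until the dependency increment falls to delta.
    selected, remaining, gamma_prev = [], list(range(X.shape[1])), 0.0
    while remaining:
        scores = {}
        for a in remaining:
            gamma, ip = evaluate(X, y, selected + [a])
            scores[a] = (gamma - gamma_prev) / (np.sqrt(ip) + 1e-12)
        best = max(scores, key=scores.get)
        gamma_best, _ = evaluate(X, y, selected + [best])
        if gamma_best - gamma_prev <= delta:
            break
        selected.append(best)
        remaining.remove(best)
        gamma_prev = gamma_best
    return selected

rng = np.random.default_rng(0)
X, y = rng.random((60, 8)), rng.integers(1, 4, size=60)
print(mcefs(X, y, delta=0.0))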

Example 3

Let Table 2 be a decision table (U, A, D); an example of the feature selection process is as follows: Let Inline graphic, Inline graphic. After the first traversal, for Inline graphic, Inline graphic is calculated as follows: Inline graphic=1.115, Inline graphic=1.104, Inline graphic=1.079, Inline graphic=1.067, Inline graphic=1.063; we can get Inline graphic. After the second traversal, Inline graphic=0.069, Inline graphic=0.174, Inline graphic=0.168, Inline graphic=0.025; we can get Inline graphic. After the third traversal, Inline graphic=0.155, Inline graphic=0.120, Inline graphic=0.094; we can get Inline graphic. After the fourth traversal, Inline graphic=0.139, Inline graphic=0.106; we can get Inline graphic. After the fifth traversal, Inline graphic, the traversal is terminated. Thus, Inline graphic.

This section first introduced a method for generating fuzzy rough approximation matrices with the aid of matrix operations. Subsequently, an inner product relevance function was constructed on the domain and its properties were analyzed. Finally, a static feature selection algorithm that takes the minimum classification error into account was proposed.

Incremental method based on block matrix

To effectively address the challenges of dynamic data environments, this section proposes an incremental feature selection method based on a block matrix framework, developed through an in-depth analysis of fuzzy rough sets. By constructing a block matrix model, the proposed method enables efficient handling of dynamic data and facilitates rapid updating of the results of feature selection. The framework of the incremental method is shown in Fig. 5.

Fig. 5. Dynamic algorithm framework diagram.

Theorem 4

Let (U, A, D) be a decision table, Inline graphic, Inline graphic, let Inline graphic denote the new samples, and let Inline graphic be the domain after the samples are added. If the fuzzy similarity relation matrix Inline graphic is updated to Inline graphic, then the update proceeds as follows:

[Equation 18]

Proof

According to the idea of a block matrix, the matrix Inline graphic after increasing the sample can be regarded as a matrix composed of four block matrices, that is, Inline graphic. When Inline graphic, the block matrix Inline graphic represents the fuzzy similarity relation matrix before the sample update; the block matrix Inline graphic represents when Inline graphic, Inline graphic, Inline graphic; the block matrix Inline graphic represents when Inline graphic, Inline graphic, Inline graphic; the block matrix Inline graphic represents when Inline graphic, Inline graphic, Inline graphic. Based on the above analysis, quod erat demonstrandum. Inline graphic

Theorem 5

Let (U, A, D) be a decision table, Inline graphic, Inline graphic, Inline graphic, Inline graphic, let Inline graphic be the center sample of class Inline graphic, Inline graphic, let FD be the fuzzy decision corresponding to U/D, and let Inline graphic. If the fuzzy decision matrix Inline graphic is updated to Inline graphic, then the update proceeds as follows:

[Equation 19]

Proof

Let Inline graphic, Inline graphic; when Inline graphic, the class center sample Inline graphic remains unchanged. In this case, the fuzzy decision matrix Inline graphic can be regarded as a matrix composed of two block matrices, namely Inline graphic. When Inline graphic, the block matrix Inline graphic, that is, the original fuzzy decision matrix remains unchanged; when Inline graphic, the block matrix Inline graphic, that is, only the fuzzy decisions of the newly added samples need to be calculated. However, when Inline graphic, the center sample of the class Inline graphic changes, and now Inline graphic; that is, the fuzzy decision matrix must be recalculated. Quod erat demonstrandum.

From the above theorem, it can be observed that when the number of samples increases, the update of the fuzzy decision matrix can be described by the block matrix.

Theorem 6

Let Inline graphic, Inline graphic, FD be the fuzzy decision corresponding to U/D, for Inline graphic, the following equation holds.

[Equation 20]

Proof

[derivation omitted]

If Inline graphic, Inline graphic, Inline graphic, then from Equation 20, the fuzzy approximation of Inline graphic is equivalent to the union of fuzzy approximations of X and Y.

According to the analysis above, the matrix Inline graphic can be obtained from the block matrix Inline graphic of the matrix Inline graphic and the block matrix Inline graphic of the matrix Inline graphic; the matrix Inline graphic is the fuzzy lower approximation obtained from the fuzzy relation of the new samples and the fuzzy decision of the new samples, and it has the same form as Inline graphic.

Theorem 7

Let (U, A, D) be a decision table, Inline graphic, Inline graphic, let Inline graphic be the center sample of class Inline graphic, Inline graphic, Inline graphic. If Inline graphic and the fuzzy lower approximation matrix Inline graphic is updated to Inline graphic, then the update proceeds as follows:

[Equation 21]

Proof

Suppose that the class center samples remain unchanged; then the fuzzy lower approximation matrix Inline graphic after the samples have been added can be regarded as a matrix composed of two block matrices, that is, Inline graphic. Therefore, when Inline graphic, if Inline graphic, then Inline graphic can be obtained by Theorem 6. If Inline graphic, Inline graphic can be obtained, that is, Inline graphic. When Inline graphic, Inline graphic. Quod erat demonstrandum. Inline graphic

Clearly, from Theorem 7, when the number of samples increases, if Inline graphic remains unchanged, the incremental method of the block matrix can effectively improve the update efficiency of the fuzzy lower approximation matrix. Again, it must be noted that the precondition of the incremental method is that Inline graphic remains unchanged.
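The block-update logic of Theorems 4-7 can be exercised with the following sketch. It assumes the Equation 7 similarity form and the centre-based fuzzy decision used in the earlier sketches, and it checks the incremental result against a full recomputation; the check is valid only under the stated precondition that the class-centre samples do not change, and all names and toy data are hypothetical.

import numpy as np

def sim(A, B, feats):
    # Assumed Equation 7 similarity: product over features of 1/(1 + |difference|).
    d = np.abs(A[:, None, feats] - B[None, :, feats])
    return np.prod(1.0 / (1.0 + d), axis=2)

def lower_approx(R, FD):
    return np.min(np.maximum(1.0 - R[:, :, None], FD[None, :, :]), axis=1)

def incremental_lower(R_old, FD_old, L_old, X_old, X_new, centers, feats):
    # Theorem 4: only the two new blocks of the relation matrix are computed.
    R_12, R_22 = sim(X_old, X_new, feats), sim(X_new, X_new, feats)
    R_new = np.block([[R_old, R_12], [R_12.T, R_22]])
    # Theorem 5: only the fuzzy decisions of the new samples are computed
    # (normalized similarity to the unchanged class centres -- an assumed form).
    S = sim(X_new, np.vstack([X_old, X_new])[centers], feats)
    FD_delta = S / S.sum(axis=1, keepdims=True)
    FD_new = np.vstack([FD_old, FD_delta])
    # Theorems 6-7: old rows are only tightened by the new columns; new rows are computed fresh.
    comp, n_old = 1.0 - R_new, len(X_old)
    extra = np.min(np.maximum(comp[:n_old, n_old:, None], FD_delta[None, :, :]), axis=1)
    L_top = np.minimum(L_old, extra)
    L_bottom = np.min(np.maximum(comp[n_old:, :, None], FD_new[None, :, :]), axis=1)
    return R_new, FD_new, np.vstack([L_top, L_bottom])

# Consistency check against a full recomputation (class centres assumed unchanged)
rng = np.random.default_rng(1)
X_old, X_new = rng.random((8, 4)), rng.random((3, 4))
feats, centers = [0, 1, 2, 3], [0, 4]               # hypothetical centre indices
R_old = sim(X_old, X_old, feats)
S_old = sim(X_old, X_old[centers], feats)
FD_old = S_old / S_old.sum(axis=1, keepdims=True)
L_old = lower_approx(R_old, FD_old)
R_new, FD_new, L_new = incremental_lower(R_old, FD_old, L_old, X_old, X_new, centers, feats)
X_all = np.vstack([X_old, X_new])
S_all = sim(X_all, X_all[centers], feats)
print(np.allclose(L_new, lower_approx(sim(X_all, X_all, feats), S_all / S_all.sum(axis=1, keepdims=True))))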

Example 4

Let Table 4 be a decision table with newly added samples, in which Inline graphic, Inline graphic. If Inline graphic remains unchanged, the calculation proceeds as follows:

  1. The fuzzy similarity matrix Inline graphic and the fuzzy decision matrix Inline graphic are updated as follows:
    [updated fuzzy similarity relation matrix and fuzzy decision matrix omitted]
  2. Next, the fuzzy lower approximation matrix Inline graphic is updated. Firstly, Inline graphic is calculated:
    [intermediate matrix omitted]
  3. Combining the matrix Inline graphic with the block matrix Inline graphic, the following block matrix can be obtained from Equation 20:

    Inline graphic .

    Thus, the updated fuzzy lower approximation matrix Inline graphic is:
    [updated fuzzy lower approximation matrix omitted]

Table 4.

Incremental sample decision table.

U′ a1 a2 a3 a4 a5 D
x1 0.09 0.16 0.22 0.29 0.27 1
x2 0.26 0.27 0.16 0.15 0.29 1
x3 0.12 0.06 0.25 0.42 0.32 1
x4 0.34 0.34 0.66 0.24 0.36 2
x5 0.46 0.42 0.48 0.67 0.45 2
x6 0.58 0.56 0.64 0.61 0.49 2
x7 0.81 0.69 0.75 0.67 0.54 3
x8 0.61 0.76 0.88 0.71 0.55 3
x9 0.79 0.86 0.74 0.82 0.61 3
x10 0.04 0.11 0.18 0.21 0.26 1
x11 0.39 0.41 0.55 0.72 0.60 2

In light of the example above, it is clear that the incremental update of the fuzzy lower approximation matrix can be realized by means of block matrices. Accordingly, this paper proposes an incremental feature selection algorithm based on block matrices, Algorithm 2 (BM-MCEFS).

Algorithm 2. BM-MCEFS

In Algorithm 2, the computational complexity of updating the fuzzy relation matrix in Step 4 is Inline graphic, that of updating the fuzzy decision matrix in Step 6 is Inline graphic, and the complexity of computing the fuzzy lower approximation matrix in Step 7 is Inline graphic. The calculation of relevance based on the inner-product in Step 12 has complexity Inline graphic. Since finding the optimal feature subset may require Inline graphic evaluations, the overall complexity of Algorithm 2 is Inline graphic.

It is self-evident that, when the class centers remain stable, it is only necessary to calculate the fuzzy relationships and fuzzy decisions of the newly added samples. This can effectively avoid a large number of redundant calculations and enable the algorithm to achieve optimal performance. Conversely, if the sample centers of the class change frequently, it can be seen from Equation 9 that the fuzzy decisions will also change accordingly, and in this case the algorithm will not be able to achieve the best performance.

Experimental results and analysis

The proposed algorithms are compared against five representative methods: the acceleration algorithm based on fuzzy rough set information entropy46 (AFFS), the heuristic algorithm with variable distance parameters based on distance measures28 (FRDM), the fuzzy rough set feature selection method based on relative distance33 (MPRB), the incremental feature selection method with fuzzy rough sets for dynamic data sets54 (IRS), and the incremental feature selection method based on fuzzy rough sets for hierarchical classification55 (ASIRA).

The experimental environment is configured as follows: an Acer A850 equipped with a 12th-generation Intel Core i7-12700 CPU (2.10 GHz) running Windows 11. All algorithms are implemented in Python. The evaluation criteria include four main aspects: algorithm robustness, computation time, reduct size (i.e., number of selected features), and classification accuracy. To reduce experimental error, each data set is tested 10 times and the final results are reported as average values.

A total of 12 datasets, obtained from the UCI Machine Learning Repository, are used in the experiments and summarized in Table 5. Two widely-used classifiers, namely Support Vector Machine (SVM) and K-Nearest Neighbors (KNN), are employed to evaluate the performance of the feature selection algorithms. Before the experiments, all datasets (see Table 5) were normalized using the following preprocessing formula:

[Equation 22]

In which Inline graphic represents the normalized value of sample x under feature a.

Table 5.

Description of experimental data.

NO Data set Abbreviation Sample Features Classes
1 MAGIC Gamma Telescope MGT 19020 10 2
2 Room Occupancy Estimation ROE 10129 18 4
3 Shill Bidding SHB 6321 9 2
4 Wine Quality WQ 4897 11 3
5 Iranian Churn IC 3150 13 2
6 Hepatitis C Virus for Egyptian patients HCE 1385 28 4
7 Turkish Music Emotion TME 400 50 4
8 TUANDROMD TR 364 241 2
9 Semeion Handwritten Digit SHD 284 256 2
10 Glass GL 214 10 7
11 Connectionist Bench CB 208 60 2
12 Iris IR 150 4 7

Markelle Kelly, Rachel Longjohn, Kolby Nottingham, The UCI Machine Learning Repository, https://archive.ics.uci.edu

Inner product relevance analysis

To analyze the relationship among inner product relevance (IPR), fuzzy rough dependency (FRD), and classification accuracy of the MCEFS algorithm, this paper randomly generates 12 feature subsets from the CB and TME datasets. The IPR, FRD and classification accuracy of each feature subset are calculated using a ten-fold cross-validation on KNN (K=3) and SVM. The results are presented in Tables 6 and 7.

Table 6.

IPR, FRD, and classification accuracy for the CB dataset.

NO Feature Subset FRD IPR SVM KNN
1 21, 17, 34, 43, 60 0.4545 0.1882 0.6431 0.7337
2 26, 9, 45, 7, 30 0.4546 0.1965 0.6672 0.6674
3 28, 33, 24, 56, 58 0.4560 0.1943 0.6044 0.5937
4 36, 39, 35, 25, 10 0.4571 0.1912 0.7201 0.7203
5 40, 38, 21, 14, 54 0.4619 0.1908 0.5943 0.6718
6 14, 7, 55, 20, 32 0.4618 0.1878 0.6382 0.7000
7 18, 41, 35, 16, 2 0.4672 0.2147 0.6189 0.5891
8 55, 19, 14, 22, 13 0.4672 0.1961 0.6675 0.6334
9 44, 11, 30, 21, 18 0.4686 0.1860 0.7932 0.8121
10 58, 42, 38, 42, 30 0.4714 0.2168 0.5704 0.5936
11 31, 23, 18, 41, 35 0.4753 0.1903 0.7046 0.7202
12 20, 31, 36, 44, 16 0.4826 0.1872 0.7487 0.7443

Table 7.

IPR, FRD, and classification accuracy for the TME dataset.

NO Feature Subset FRD IPR SVM KNN
1 10, 18, 11, 13, 50 0.2478 0.0563 0.4401 0.4147
2 36, 28, 1, 32, 16 0.2649 0.0580 0.4983 0.4571
3 29, 4, 44, 38, 11 0.2681 0.0574 0.4668 0.4024
4 2, 5, 1, 49, 4 0.2699 0.0561 0.5234 0.4922
5 33, 17, 9, 24, 50 0.2721 0.0554 0.4884 0.4219
6 8, 37, 30, 35, 44 0.2733 0.0561 0.4672 0.3173
7 17, 6, 44, 39, 23 0.2763 0.0557 0.5184 0.4503
8 18, 5, 24, 38, 43 0.2776 0.0558 0.4900 0.4821
9 49, 2, 45, 39, 26 0.2781 0.0551 0.6613 0.6264
10 25, 19, 48, 47, 33 0.2822 0.0563 0.4751 0.4573
11 36, 13, 43, 33, 14 0.2903 0.0567 0.4572 0.3590
12 44, 43, 24, 35, 46 0.3082 0.0563 0.6364 0.5484

From Tables 6 and 7, it can be observed that the minimum classification error (IPR of the inner product function) is related to the classification ability of the feature subsets. For example, in the CB dataset, the 9th group of feature subsets has the lowest IPR value and achieves the highest classification accuracy on both the KNN and SVM classifiers; conversely, the 10th group has a higher IPR value and correspondingly lower classification accuracy on both classifiers. In addition, the fifth and sixth groups have nearly identical FRD values; however, the sixth group has a smaller IPR and higher classification accuracy, indicating that when FRD remains constant, the minimum classification error can identify features that are more valuable for classification performance. Nevertheless, this conclusion does not hold universally. For example, in the TME dataset, although the fifth group has a relatively low IPR value, its classification accuracy is also low on both classifiers. This discrepancy arises because the minimum classification error captures only the information related to misclassification within the feature subset, which represents just one aspect of the overall classification capability. Therefore, relying solely on the minimum classification error cannot comprehensively reflect the classification performance of a feature subset.

Additionally, in this study, the threshold Inline graphic was set to 0. Using MCEFS, a series of feature subsets were successfully generated on the four datasets of TME, CB, WQ and SHB. In the SVM and KNN classifiers, based on the feature importance determined by the MCEFS algorithm, features were gradually added from high to low, and at the same time, features were also gradually added according to the unsorted feature sequence in the original data set (RAW) as a control group. The classification accuracy curves that changed with the increase in the number of features were plotted. The experimental results are illustrated in Figs. 6 and 7. The experimental results show that the MCEFS algorithm can significantly improve the classification accuracy of the data. For example, in Fig. 6, when the MCEFS algorithm obtains the highest classification accuracy on the CB dataset, its performance exceeds the highest classification accuracy of RAW; in the SHB dataset, the highest classification accuracy obtained by the MCEFS algorithm on both classifiers exceeds the classification accuracy of RAW. Therefore, it can be concluded that the MCEFS algorithm improves the accuracy of data classification.

Fig. 6. Classification accuracy curve with the increase of features (SVM).

Fig. 7. Classification accuracy curve with the increase of features (KNN).

To verify the sensitivity of the MCEFS algorithm to the parameter Inline graphic, this study conducted experiments on four randomly selected datasets (CB, HCE, TME, and GL) and validated the results of the selection of features using two classifiers, SVM and 3NN. During the threshold selection process, each dataset was tested multiple times to determine the fluctuation range of its threshold. Subsequently, the classifiers were employed to test the accuracy of the feature selection results, thereby identifying the threshold sensitivity interval for each dataset. Finally, within this sensitivity interval, a uniform step size was set to adjust the threshold and test the changes in reduction length, classification accuracy, and stability. The experimental results are shown in Figs. 8 and 9.

Fig. 8. Parameter Inline graphic sensitivity analysis under the KNN classifier.

Fig. 9. Parameter Inline graphic sensitivity analysis under the SVM classifier.

As can be seen in Figs. 8 and 9, as the parameter Inline graphic increases, the feature subset selected by the MCEFS algorithm gradually decreases in size, and its classification accuracy on the classifiers also exhibits certain fluctuations. However, within a specific range of values Inline graphic, these fluctuations are effectively controlled at a low level. Taking the Glass data set as an example (see Figs. 8 and 9), when Inline graphic ranges from 0.002 to 0.012, the classification accuracy remains relatively stable despite continuous changes in the parameter. This result indicates that the MCEFS algorithm demonstrates strong robustness during parameter adjustment.

Comparison of static feature selection algorithms

To evaluate the feature selection efficiency of the MCEFS algorithm, this paper uses the KNN and SVM classifiers to assess the optimal classification accuracy achieved by the four compared algorithms on 12 datasets. Based on these results, the optimal feature subsets and their corresponding running times are determined, as summarized in Table 8. Furthermore, the classification accuracy results for the 12 datasets, based on the selected optimal feature subsets, are presented in Fig. 10. In these results, the underlined values indicate the highest classification accuracy achieved for each dataset.

Table 8.

The reduct size and running time of the four algorithms (number/time).

Data set RAW FRDM AFFS MPRB MCEFS
MGT 10 6.5/2441.73 3.7/4148.52 5.6/2704.46 5.2/2862.28
ROE 18 4.8/1512.25 8.8/4545.23 4.1/1155.61 9.3/2408.32
SHB 9 3.4/94.45 5.3/275.62 4.0/188.60 1.0/48.92
WQ 11 5.1/271.88 5.8/938.26 4.7/312.07 5.6/357.77
IC 13 1.0/82.25 8.4/461.91 9.1/192.03 6.2/231.63
HCE 28 14.3/143.75 18.4/358.61 12.5/128.68 6.9/98.03
TME 50 34.2/136.51 29.5/70.67 29.5/70.6765 25.1/149.22
TR 241 10/13.08 124/2362.56 1.9/1.86 64.0/2072.05
SHD 256 6/14.96 28/386.49 7.2/55.85 8.5/21.46
GL 10 6.3/1.29 6.7/3.42 5.0/2.65 6.1/1.468
CB 60 26.5/21.21 37.4/29.58 24.5/18.78 26.1/32.41
IR 4 2.5/0.2329 2.7/0.36 3/2.88 2.3/0.26
Average 21.3 10.05/835.60 23.22/1934.44 9.26/858.51 13.86/1241.76

Fig. 10. Classification-accuracy heatmap from feature-selected data.

The results of the experiment in Table 8 demonstrate that all four algorithms successfully achieved dimensionality reduction. Among them, the MCEFS algorithm generally selected fewer features than the other methods on most datasets. However, in the ROE dataset, the number of features selected by MCEFS exceeded that of the other three algorithms.

By comparing the results in Table 8 and Fig. 10, it is evident that the MCEFS algorithm consistently outperformed the others in terms of classification accuracy with both the KNN and SVM classifiers. Specifically, on the ROE dataset, MCEFS selected only about 9 of the original 18 features, still achieving a substantial reduction in dimensionality.

In terms of computational time, the MCEFS algorithm performs comparably to the other algorithms. However, for datasets with a large number of features (such as the CB dataset), the runtime of the MCEFS algorithm is relatively longer. This is mainly because the comparison algorithms only calculate the lower approximations of the samples, while MCEFS additionally computes the inner product correlation between the samples, thus increasing the computational time. It should be noted that MCEFS achieved higher classification accuracy on most datasets. For example, on the SHB dataset, the MCEFS algorithm achieved higher classification accuracy even when selecting the smallest feature subset; on the IR dataset, the MCEFS algorithm significantly outperformed the other algorithms in terms of accuracy. These results indicate that the method based on the minimum classification error can effectively compensate for some of the shortcomings of fuzzy rough dependency and improve the classification accuracy of the data.

To assess the robustness of the MCEFS algorithm, 10% and 20% label noise was introduced into five selected datasets. Feature selection was then performed and the optimal classification accuracy was recorded. The experimental results are presented in Tables 9 and 10. The results show that, compared to the other algorithms, MCEFS consistently achieves higher classification accuracy under both noise levels. This performance advantage is primarily attributed to the incorporation of the minimum classification error criterion into the MCEFS framework.

Table 9.

Classification accuracy of noise data at 10% noise level.

Data set Classifier RAW Noised data AFFS FRDM MPRB MCEFS
HCE SVM 67.87 45.59 45.59 45.53 47.75 48.35
KNN 63.03 45.16 45.16 44.44 44.45 45.09
TME SVM 75.62 75.10 62.06 67.09 68.18 69.81
KNN 68.33 60.26 49.22 57.78 52.97 59.06
GL SVM 97.19 85.87 85.87 85.87 78.79 87.25
KNN 94.83 85.41 85.41 84.48 92.12 86.32
CB SVM 85.00 77.33 68.64 78.26 83.36 80.67
KNN 84.52 79.64 58.38 83.02 82.36 82.57
IR SVM 95.90 93.33 76.62 92.67 91.61 92.67
KNN 95.24 92.00 70.52 92.00 87.68 94.00
Average 82.75 73.97 64.75 73.11 72.94 74.58

Table 10.

Classification accuracy of noise data at 20% noise level.

Data set Classifier RAW Noised data AFFS FRDM MPRB MCEFS
HCE SVM 67.87 43.87 45.38 44.09 46.06 47.26
KNN 63.03 45.02 45.38 44.44 47.63 45.23
TME SVM 75.62 75.87 71.84 66.10 54.55 75.62
KNN 68.33 59.03 61.03 57.01 47.45 62.56
GL SVM 97.19 88.64 88.64 89.61 64.70 91.00
KNN 94.83 86.82 86.82 85.84 77.05 87.27
CB SVM 85.00 82.55 70.62 79.17 70.45 80.64
KNN 84.52 81.52 66.17 82.57 63.73 81.57
IR SVM 95.24 91.90 74.43 91.24 78.21 91.24
KNN 95.90 89.90 71.05 89.90 67.14 91.29
Average 82.75 74.51 68.14 73.00 61.10 75.37

To further analyze and compare the statistical performance of the four algorithms, the Friedman test56 and the Nemenyi post-hoc test are applied based on the classification accuracy results presented in Fig. 10. These two statistical methods are defined as follows.

$$\chi_F^2=\frac{12n}{k(k+1)}\left[\sum_{i=1}^{k}R_i^2-\frac{k(k+1)^2}{4}\right] \tag{23}$$

$$F_F=\frac{(n-1)\chi_F^2}{n(k-1)-\chi_F^2} \tag{24}$$

where $n$ and $k$ denote the number of datasets and the number of algorithms, respectively, and $R_i$ denotes the average rank of the $i$-th algorithm across all datasets.

$$CD=q_\alpha\sqrt{\frac{k(k+1)}{6n}} \tag{25}$$

where $\alpha$ denotes the significance level and $q_\alpha$ is the corresponding critical value.

If the null hypothesis based on the Friedman test is rejected, the Nemenyi post-hoc test is then employed to assess the significance of pairwise differences between algorithms. Specifically, if the difference in average rankings between two algorithms exceeds the critical distance (CD), their performance is considered significantly different. In the corresponding diagram, significant differences are indicated by the absence of connecting lines between algorithms, whereas non-significant differences are shown by horizontal lines connecting them.

Here, $n=12$, $k=4$, and at $\alpha=0.05$ the critical value56 is $q_\alpha=2.569$. The Friedman statistics computed for SVM and KNN are 7.815 and 3.86, respectively, both greater than the critical value of 2.569. The null hypothesis that the four algorithms perform equivalently is therefore rejected for both classifiers, indicating significant differences among the algorithms. According to Equation 25, CD = 1.35, and the CD diagrams of the four feature selection algorithms under SVM and KNN are shown in Fig. 11. As can be seen from Fig. 11, the Nemenyi test indicates that, under both the SVM and KNN classifiers, the MCEFS algorithm is significantly superior to the other algorithms, highlighting its competitiveness.
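
The calculation can be reproduced with a short NumPy sketch of Eqs. (23)-(25); the average ranks below are illustrative rather than those derived from Fig. 10, but with k = 4, n = 12, and q_0.05 = 2.569 the sketch recovers CD = 1.35.

```python
# Minimal sketch of Eqs. (23)-(25): Friedman statistic, its F-distributed
# variant, and the Nemenyi critical distance (ranks are illustrative only).
import numpy as np

def friedman_cd(avg_ranks, n, q_alpha):
    """avg_ranks: average rank of each of the k algorithms over n datasets."""
    k = len(avg_ranks)
    chi2_f = 12 * n / (k * (k + 1)) * (np.sum(np.square(avg_ranks)) - k * (k + 1) ** 2 / 4)  # Eq. (23)
    f_f = (n - 1) * chi2_f / (n * (k - 1) - chi2_f)                                          # Eq. (24)
    cd = q_alpha * np.sqrt(k * (k + 1) / (6 * n))                                            # Eq. (25)
    return chi2_f, f_f, cd

# k = 4 algorithms over n = 12 datasets; q_0.05 = 2.569 from Demsar's table (ref. 56)
avg_ranks = np.array([1.6, 2.9, 2.7, 2.8])     # illustrative average ranks, not the paper's
chi2_f, f_f, cd = friedman_cd(avg_ranks, n=12, q_alpha=2.569)
print(round(cd, 2))                            # 1.35, matching the CD used in Fig. 11
```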

Fig. 11.

The Nemenyi test results of classification accuracies of the four algorithms.

Comparison of feature selection algorithms under incremental sample addition

To test the efficiency of the incremental feature selection algorithms as samples are added, each dataset is randomly split into two halves: 50% of the samples serve as the original data and the remaining 50% as the pool of samples to be added. In each round, 10% of the pool is randomly added, and this is repeated five times until 50% of the pool has been added cumulatively. For each algorithm, the feature subset corresponding to the optimal classification accuracy is selected and the time consumed in selecting it is recorded, so that the computational cost of the different algorithms can be compared. The running times of the algorithms on the 12 datasets are shown in Fig. 12.
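
The sketch below illustrates this protocol under the stated 50/50 split with five 10% increments; `select_features` is a hypothetical placeholder for any of the compared algorithms, and only the sampling and timing pattern is meant to be representative.

```python
# Sketch of the incremental protocol: 50/50 split, five 10% increments of the
# held-out pool, with the selection time measured in each round.
import time
import numpy as np

def select_features(X, y):
    """Hypothetical stand-in for any of the compared selection algorithms."""
    return list(range(min(5, X.shape[1])))

rng = np.random.default_rng(0)
X = rng.random((200, 20))
y = rng.integers(0, 3, 200)

idx = rng.permutation(len(X))
base, pool = idx[:len(X) // 2], idx[len(X) // 2:]   # 50/50 split of the samples
step = len(pool) // 10                              # each increment is 10% of the pool
current = base
for r in range(1, 6):                               # five increments, 50% of the pool in total
    current = np.concatenate([current, pool[(r - 1) * step: r * step]])
    t0 = time.perf_counter()
    subset = select_features(X[current], y[current])
    elapsed = time.perf_counter() - t0
    print(f"round {r}: {len(current)} samples, {len(subset)} features, {elapsed:.4f} s")
```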

Fig. 12.

Running time of the compared algorithms as samples are incrementally added in given proportions.

In each subplot of Fig. 12, the abscissa denotes the proportion of added samples and the ordinate denotes the cumulative computation time of the six algorithms. Across the 12 datasets, the computation time of every algorithm increases as the proportion of added samples grows, but the BM-MCEFS algorithm consistently requires the least time. In addition, ten-fold cross-validation is used to examine how the classification accuracy under the KNN and SVM classifiers changes after samples are added in different proportions to the 12 datasets; the results are shown in Tables 11 and 12, where 'RAW' denotes the classification accuracy obtained with all features. Since BM-MCEFS is the incremental counterpart of the MCEFS algorithm, the two yield nearly identical classification accuracy when the same samples are added. Therefore, in the incremental experiments, BM-MCEFS is compared with the AFFS, FRDM, MPRB, IRS, and ASIRA algorithms.

Table 11.

Classification accuracy as samples are incrementally added in given proportions (SVM).

Data set Algorithm 10% 20% 30% 40% 50% Average
IR BM-MCEFS Inline graphic 97.09±8.90 Inline graphic 96.26±14.07 Inline graphic Inline graphic
IRS 97.32±13.24 Inline graphic 95.34±12.37 96.25±14.31 96.60±14.23 96.53±12.75
ASIRA 97.77±13.57 97.06±8.78 95.59±10.99 96.21±14.12 96.32±13.89 96.59±12.44
AFFS 96.67±10.20 91.19±8.56 90.68±14.93 95.69±20.59 89.90±17.79 92.83±14.41
FRDM 97.78±8.90 97.09±8.90 95.68±11.87 Inline graphic 91.52±8.02 95.67±10.35
MPRB 97.75±13.33 97.00±8.00 95.68±11.87 96.26±14.05 96.52±13.07 96.64±12.25
RAW 96.67±14.23 91.18±9.37 90.68±11.87 95.26±14.07 86.67±8.90 92.09±11.69
CB BM-MCEFS Inline graphic Inline graphic 66.80±35.53 74.74±40.71 Inline graphic Inline graphic
IRS 64.45±38.12 68.44±41.45 67.78±34.67 74.87±40.37 71.14±30.32 69.34±37.21
ASIRA 64.54±40.10 68.89±41.43 68.43±36.87 74.75±39.43 71.12±30.06 69.55±37.79
AFFS 64.10±6.49 68.95±5.73 66.69±5.00 62.87±31.72 57.64±22.89 64.05±14.37
FRDM 58.01±33.85 61.71±49.62 64.45±43.14 Inline graphic 67.26±31.61 65.32±37.64
MPRB 58.72±44.57 60.43±38.73 Inline graphic 73.01±33.87 66.31±37.07 65.74±37.21
RAW 63.91±40.26 59.67±50.11 60.18±38.16 66.58±34.98 69.71±34.73 64.01±39.65
GL BM-MCEFS Inline graphic Inline graphic 78.14±17.09 Inline graphic 79.89±15.17 Inline graphic
IRS 87.91±16.34 87.15±22.56 82.34±18.11 78.01±17.21 79.76±15.42 83.03±18.10
ASIRA 88.88±19.23 86.87±23.81 79.66±15.21 77.49±18.53 Inline graphic 82.56±18.69
AFFS 88.13±20.99 86.54±23.24 76.41±14.11 77.82±20.28 78.01±12.57 81.38±18.24
FRDM 88.13±15.85 73.38±15.64 73.46±9.99 73.11±18.10 79.44±15.13 77.50±14.94
MPRB 71.43±13.26 73.38±15.64 Inline graphic 72.08±25.71 77.97±13.59 77.02±17.40
RAW 88.13±20.99 86.54±23.24 76.41±14.11 77.82±20.28 78.94±13.49 81.57±18.42
TME BM-MCEFS 73.97±19.74 Inline graphic Inline graphic Inline graphic 70.08±9.78 Inline graphic
IRS 73.84±28.34 76.26±14.87 75.06±6.88 77.16±5.98 70.08±9.78 74.48±15.51
ASIRA Inline graphic 76.44±12.87 74.76±8.59 77.27±8.24 70.12±9.67 74.52±12.69
AFFS 68.44±18.68 75.17±15.85 72.30±13.38 77.06±14.11 Inline graphic 73.71±15.00
FRDM 61.34±16.72 66.16±8.30 67.61±15.56 66.19±11.88 65.84±14.73 65.43±13.44
MPRB 69.73±17.73 67.65±8.13 72.60±11.98 66.46±11.41 73.35±12.27 69.96±12.69
RAW 66.88±11.97 68.40±13.67 74.37±15.23 68.46±12.93 69.10±13.13 69.44±13.39
HCE BM-MCEFS Inline graphic 49.36±7.85 Inline graphic Inline graphic Inline graphic Inline graphic
IRS 46.54 ±4.92 49.32±7.69 47.16±11.72 46.95±4.26 44.53±8.74 46.90±7.94
ASIRA 47.03 ±5.47 Inline graphic 47.22±10.15 46.23±7.34 44.26±8.48 46.83±8.03
AFFS 44.72±7.85 44.46±8.17 43.41±6.61 44.34±6.44 44.66±6.00 44.32±7.01
FRDM 44.42±7.30 44.09±7.68 44.00±5.95 45.44±8.99 43.86±6.22 44.36±7.23
MPRB 46.45±7.74 43.81±10.52 42.90±6.95 44.11±6.68 43.51±7.74 44.16±8.04
RAW 44.72±7.85 42.15±5.79 42.73±5.77 44.34±6.44 44.66±6.01 43.72±6.37
IC BM-MCEFS 88.65±3.54 89.11±3.68 Inline graphic Inline graphic Inline graphic Inline graphic
IRS 87.34±2.51 89.02±3.05 89.19±4.29 89.05±4.22 89.31±2.37 88.78±3.39
ASIRA Inline graphic 88.75±5.39 89.21±8.21 88.42±4.77 88.67±3.72 88.75±5.42
AFFS 88.61±3.42 89.12±3.68 89.40±3.35 89.35±3.15 89.25±2.30 89.16±3.12
FRDM 88.45±3.52 89.10±3.68 89.40±3.35 89.39±3.15 84.28±0.34 88.12±2.81
MPRB 88.65±3.62 Inline graphic 89.40±3.35 89.39±3.15 89.15±2.41 89.14±3.27
RAW 88.35±3.62 89.10±3.68 89.40±3.35 89.33±3.15 79.55±2.32 87.15±3.22
WQ BM-MCEFS 49.14±6.44 Inline graphic 50.54±6.48 Inline graphic 49.41±8.42 Inline graphic
IRS 49.84±5.34 49.23±5.44 51.01±6.51 51.46±7.11 49.41±8.42 50.19±6.66
ASIRA 49.91±8.72 49.02±6.19 50.51±6.32 50.88±6.45 49.87±8.54 50.04±7.33
AFFS 45.20±3.52 44.93±3.31 46.31±2.91 47.63±4.60 45.03±0.21 45.82±2.91
FRDM Inline graphic 49.62±4.83 Inline graphic 51.86±7.26 45.07±1.03 49.32±4.91
MPRB 49.20±6.33 48.70±6.16 50.19±6.52 50.99±7.29 Inline graphic 50.05±6.78
RAW 49.05±6.35 49.32±6.07 50.54±6.48 51.08±7.48 45.03±7.68 49.00±6.81
SHB BM-MCEFS Inline graphic 98.32±1.17 Inline graphic 98.15±1.06 Inline graphic Inline graphic
IRS 98.21±1.44 97.54±1.43 97.65±1.21 Inline graphic 97.69±0.68 97.86±1.23
ASIRA 97.87±2.13 Inline graphic 98.21±0.94 98.11±1.21 97.87±1.87 98.10±1.78
AFFS 98.32±1.39 92.56±2.20 93.48±1.21 93.41±1.77 98.04±0.94 95.16±1.50
FRDM 98.34±1.30 97.65±0.91 97.60±0.80 97.52±0.80 98.04±0.91 97.83±0.94
MPRB 97.68±1.15 97.65±0.91 97.60±0.80 97.52±0.80 98.01±0.85 97.69±0.91
RAW 98.32±1.39 98.32±1.17 98.20±0.99 92.09±0.79 88.04±0.91 94.99±1.05
ROE BM-MCEFS Inline graphic Inline graphic 98.98±3.36 Inline graphic 98.60±2.82 Inline graphic
IRS 98.15±3.48 99.13±2.58 Inline graphic 98.37±3.52 98.33±2.48 98.60±15.39
ASIRA 98.997±2.76 99.05±2.51 98.77±2.71 98.05±2.79 Inline graphic 98.71±2.57
AFFS 97.31±7.00 97.77±5.83 97.53±4.95 95.78±5.46 89.47±8.20 95.57±6.29
FRDM 86.23±7.29 88.16±6.21 97.14±6.92 89.68±9.42 91.19±7.92 90.48±7.55
MPRB 86.23±7.29 88.16±6.21 96.96±7.63 90.20±8.49 91.19±7.89 90.55±7.54
RAW 99.04±3.29 99.17±2.83 98.95±3.35 98.39±3.21 88.58±2.82 96.83±3.10
MGT BM-MCEFS Inline graphic Inline graphic 86.43±1.11 81.56±0.99 79.16±1.00 Inline graphic
IRS 87.66±1.07 89.16±0.75 86.49±1.21 81.56±0.99 79.19±1.02 84.81±1.02
ASIRA 87.54±1.28 88.68±1.22 86.51±2.15 Inline graphic 79.22±1.31 84.71±1.49
AFFS 87.66±1.03 89.23±0.76 85.29±0.78 81.53±0.99 78.59±1.51 84.46±1.01
FRDM 87.61±0.98 89.21±0.85 Inline graphic 81.56±0.99 79.06±1.10 84.78±1.04
MPRB 86.41±0.59 89.21±0.85 85.41±1.17 81.54±1.24 Inline graphic 84.64±1.03
RAW 87.66±1.03 89.21±0.88 86.42±1.10 81.55±1.05 79.16±0.90 84.80±0.99
TR BM-MCEFS Inline graphic Inline graphic 99.58±2.50 Inline graphic 97.86±5.71 Inline graphic
IRS 99.32±3.12 99.38±3.18 Inline graphic 99.01±2.51 97.64±4.87 98.99±3.37
ASIRA 99.16±2.67 99.51±2.54 99.52±2.31 99.10±2.71 Inline graphic 99.03±3.48
AFFS 97.87±3.11 97.12±4.23 98.55±2.61 97.46±4.58 91.03±11.21 96.41±6.02
FRDM 99.00±4.00 99.09±5.45 98.75±3.82 98.46±3.77 91.57±3.50 97.37±4.17
MPRB 99.00±4.00 99.09±5.45 98.75±3.82 98.46±3.77 96.57±3.50 98.37±4.17
RAW 96.50±3.00 96.55±2.73 96.58±2.50 96.23±3.08 96.29±2.86 96.43±2.84
SHD BM-MCEFS Inline graphic 93.68±16.07 Inline graphic 91.34±0.48 Inline graphic Inline graphic
IRS 93.12±11.76 93.54±15.44 91.54±15.42 91.34±0.48 97.33±14.87 93.37±12.93
ASIRA 92.87±13.91 Inline graphic 91.09±17.65 91.34±0.48 97.39±16.21 93.28±14.60
AFFS 90.31±12.03 88.75±23.71 88.00±34.97 90.42±11.25 93.29±16.97 90.15±21.65
FRDM 91.52±23.07 88.00±34.97 88.00±34.97 94.21±19.57 93.34±16.33 91.01±26.94
MPRB 91.93±22.91 88.40±35.24 76.64±29.68 Inline graphic 91.17±15.73 88.47±25.60
RAW 91.08±15.19 93.68±16.07 91.67±19.47 92.31±18.43 93.29±16.97 92.41±17.30

Table 12.

Classification accuracy as samples are incrementally added in given proportions (KNN).

Data set Algorithm 10% 20% 30% 40% 50% Average
IR BM-MCEFS Inline graphic 96.09±13.11 Inline graphic Inline graphic 96.52±13.07 Inline graphic
IRS 97.45±8.32 95.43±12.67 96.14±2.31 94.52±12.11 94.28±9.87 95.56±9.79
ASIRA 97.22±6.73 Inline graphic 96.47±2.31 94.52±14.27 Inline graphic 96.26±10.78
AFFS 97.68±8.90 71.18±19.12 74.85±19.48 75.44±12.81 78.48±13.42 79.53±14.75
FRDM 97.58±8.90 96.09±13.11 94.85±11.71 94.67±15.43 95.86±14.28 95.81±12.69
MPRB 96.67±14.23 95.09±13.29 94.85±11.71 94.67±15.43 94.72±12.13 95.20±13.43
RAW 97.58±8.89 95.09±13.29 94.85±11.71 94.73±13.78 95.19±12.74 95.49±12.08
CB BM-MCEFS 65.58±28.18 Inline graphic Inline graphic Inline graphic Inline graphic Inline graphic
IRS Inline graphic 74.32±19.87 66.45±30.13 67.76±32.54 61.47±25.81 67.22±26.90
ASIRA 65.08±26.41 75.04±22.12 66.41±28.12 68.09±28.91 61.41±26.23 67.21±26.46
AFFS 57.88±29.77 58.43±29.54 60.70±13.19 56.37±22.96 51.74±31.04 57.02±25.30
FRDM 64.87±36.52 70.57±37.17 65.33±31.29 64.33±35.70 61.50±32.39 65.32±34.61
MPRB 63.33±20.26 73.19±33.10 62.35±34.55 64.94±46.51 60.02±32.89 64.77±34.48
RAW 61.67±21.59 70.48±35.34 62.39±29.37 66.46±32.39 60.00±28.85 64.20±29.51
GL BM-MCEFS Inline graphic Inline graphic 88.53±16.24 Inline graphic Inline graphic Inline graphic
IRS 91.09±17.62 90.34±18.23 Inline graphic 86.54±19.67 88.54±20.21 89.11±18.74
ASIRA 90.78±15.76 90.78±17.131 88.67±16.45 86.92±21.72 88.72±20.59 89.17±18.48
AFFS 86.70±22.53 87.08±17.05 85.62±18.65 86.11±14.29 85.50±21.60 86.20±18.82
FRDM 89.62±13.26 90.25±15.73 85.03±17.20 84.55±13.44 85.97±19.24 87.08±15.77
MPRB 90.32±13.30 90.25±15.73 83.14±24.82 Inline graphic 87.58±26.68 88.62±23.82
RAW 86.70±22.53 87.08±17.05 85.62±18.65 86.11±14.29 85.97±19.24 86.30±18.35
TME BM-MCEFS Inline graphic Inline graphic 61.22±14.76 62.52±15.71 Inline graphic Inline graphic
IRS 65.45±21.64 67.23±12.12 63.34±12.14 64.32±15.19 58.11±15.88 63.69±15.78
ASIRA 65.77±22.12 67.55±11.05 62.65±13.81 63.83±11.09 58.07±14.82 63.57±15.13
AFFS 65.42±20.17 61.03±21.19 64.16±17.92 Inline graphic 57.51±11.01 62.58±16.12
FRDM 65.61±21.55 59.66±12.70 Inline graphic 61.23±13.37 55.31±12.80 61.22±14.76
MPRB 59.22±17.27 60.75±10.43 61.02±11.41 59.74±13.62 58.79±12.47 60.30±13.25
RAW 63.84±16.97 64.38±16.27 64.16±16.23 62.11±15.07 52.05±11.62 61.31±15.23
HCE BM-MCEFS 47.56±7.28 Inline graphic Inline graphic 45.04±7.17 43.07±5.20 Inline graphic
IRS Inline graphic 46.23±8.17 48.54±10.11 45.26±7.62 43.59±6.35 46.37±8.23
ASIRA 47.32±7.41 46.37±8.63 48.68±10.43 45.18±7.04 43.73±5.86 46.26±8.03
AFFS 45.74±7.54 45.39±7.19 45.19±6.13 44.81±7.47 Inline graphic 45.13±7.03
FRDM 45.13±7.65 44.19±8.03 44.35±9.04 Inline graphic 44.29±5.94 44.66±7.11
MPRB 45.13±7.84 42.81±9.03 43.09±10.23 45.12±7.99 45.33±8.50 44.30±8.76
RAW 45.74±7.54 43.27±8.16 43.59±8.31 44.81±7.47 41.52±6.78 43.79±7.65
IC BM-MCEFS 94.97±2.76 Inline graphic 95.29±2.67 Inline graphic 95.05±2.92 Inline graphic
IRS Inline graphic 95.22±4.22 95.24±2.87 94.88±4.27 94.37±2.45 95.11±3.53
ASIRA 94.69±2.81 95.31±3.26 95.06±2.45 94.92±2.53 95.01±2.81 95.00±2.79
AFFS Inline graphic 95.29±3.21 95.29±2.84 93.92±3.20 95.01±2.92 94.92±2.99
FRDM 94.81±3.08 95.20±3.10 Inline graphic 94.98±2.72 84.28±0.33 92.94±2.39
MPRB 94.76±3.22 95.13±3.03 95.45±2.73 94.98±2.72 Inline graphic 95.12±2.86
RAW 94.92±2.83 95.33±3.01 95.37±2.70 94.87±3.00 82.24±2.90 92.55±2.89
WQ BM-MCEFS Inline graphic 47.06±7.17 Inline graphic 47.58±8.17 45.77±7.84 Inline graphic
IRS 45.19±5.32 Inline graphic 47.05±5.32 47.64±8.47 45.87±7.56 46.59±7.05
ASIRA 45.29±5.72 47.01±7.57 47.01±4.92 47.97±9.48 45.93±8.63 46.64±7.46
AFFS 44.95±4.85 47.14±9.98 45.77±5.51 47.50±7.15 42.81±6.21 45.63±6.74
FRDM 44.68±5.16 46.92±5.27 47.25±5.79 Inline graphic 44.52±6.67 46.40±6.56
MPRB 45.23±5.29 45.91±5.01 46.09±5.42 48.17±8.92 Inline graphic 46.32±7.36
RAW 45.31±3.08 46.29±4.74 47.10±5.46 47.46±11.59 48.85±10.82 47.00±7.14
SHB BM-MCEFS Inline graphic Inline graphic Inline graphic Inline graphic 98.86±0.91 Inline graphic
IRS 99.08±1.21 99.12±0.38 99.21±1.01 98.87±0.69 Inline graphic 99.04±0.89
ASIRA 99.10±1.11 99.07±1.02 99.17±0.97 99.04±0.59 98.84±0.88 98.52±3.02
AFFS 99.01±1.05 92.54±2.40 92.46±2.31 92.43±2.51 97.04±1.10 94.70±1.87
FRDM 98.34±1.47 97.34±1.44 97.60±0.80 97.52±0.80 97.83±1.10 97.73±1.12
MPRB 97.68±1.15 97.34±1.44 97.60±0.80 97.52±0.80 97.97±0.93 97.62±1.05
RAW 98.11±1.05 99.07±1.01 99.14±0.88 98.13±1.10 88.92±1.03 96.67±1.01
ROE BM-MCEFS Inline graphic Inline graphic 98.46±3.22 Inline graphic Inline graphic Inline graphic
IRS 99.19±2.21 99.38±1.54 98.23±2.76 98.21±3.48 97.65±5.43 98.53±3.36
ASIRA 99.16±2.24 99.31±1.41 Inline graphic 98.09±3.56 97.59±3.98 98.52±3.02
AFFS 90.17±26.88 93.73±17.57 92.48±17.49 94.97±13.75 89.22±11.20 92.11±17.38
FRDM 93.60±6.71 94.50±4.46 96.20±7.87 95.25±9.17 97.47±5.80 95.44±6.21
MPRB 93.60±6.71 94.50±4.46 97.16±5.55 97.19±6.12 97.45±5.75 95.98±5.77
RAW 98.87±3.17 99.05±2.62 98.38±4.04 98.15±3.70 87.99±4.20 96.07±3.51
MGT BM-MCEFS Inline graphic Inline graphic 89.18±1.21 Inline graphic 83.89±1.61 Inline graphic
IRS 89.87±2.56 91.37±1.67 Inline graphic 85.19±1.37 83.89±1.61 87.91±1.84
ASIRA 90.13±2.76 91.29±1.76 89.24±1.65 85.17±1.12 83.77±1.25 87.92±1.80
AFFS 90.16±1.38 91.21±1.23 86.77±1.15 85.22±1.26 80.49±1.62 86.77±1.33
FRDM 90.10±1.33 91.38±1.12 89.14±1.37 85.23±1.23 Inline graphic 87.96±1.45
MPRB 89.24±1.36 91.38±1.12 89.14±1.37 85.42±1.58 82.41±1.27 87.52±1.35
RAW 90.16±1.38 91.21±1.23 88.89±1.46 85.17±1.25 81.77±1.61 87.44±1.39
TR BM-MCEFS Inline graphic Inline graphic 99.89±0.12 Inline graphic 95.34±7.85 Inline graphic
IRS 99.47±3.34 98.84±3.47 99.86±0.17 99.58±2.78 95.19±6.45 98.59±3.81
ASIRA 99.42±2.89 99.02±3.51 Inline graphic 99.45±2.17 95.31±7.34 98.62±3.98
AFFS 96.12±2.74 98.73±3.71 92.64±6.91 92.11±3.27 88.89±5.93 93.70±4.79
FRDM 99.00±4.00 99.09±5.45 90.38±7.48 88.82±2.12 79.62±1.98 91.38±4.69
MPRB 99.00±4.00 99.09±5.45 90.38±7.48 88.82±2.12 86.62±1.98 92.78±4.69
RAW 99.50±3.00 99.09±3.64 99.17±3.33 99.23±3.08 Inline graphic 98.66±3.19
SHD BM-MCEFS Inline graphic 84.18±0.56 Inline graphic 85.34±0.48 Inline graphic Inline graphic
IRS 90.29±11.26 Inline graphic 84.66±0.38 85.32±0.45 88.07±11.76 86.51±7.29
ASIRA 90.35±12.02 84.18±0.56 84.74±0.72 85.31±0.34 88.11±11.26 86.54±7.38
AFFS 84.21±1.82 81.45±3.37 82.78±8.67 Inline graphic 87.62±4.77 84.65±5.03
FRDM 83.53±0.61 83.78±2.58 84.40±2.47 84.97±2.38 85.50±2.28 84.44±2.19
MPRB 83.53±0.61 82.58±9.74 84.79±0.52 84.97±2.38 86.93±9.00 84.56±6.04
RAW 80.13±17.52 81.34±9.77 81.67±8.12 81.20±7.55 81.17±7.28 81.10±10.75

Tables 11 and 12 show that, under the KNN classifier, the average accuracy of the BM-MCEFS algorithm across the 12 datasets is higher than that of the other algorithms, and under the SVM classifier its classification accuracy is significantly higher as well. For example, on the HCE dataset in Table 11, the average accuracy of BM-MCEFS is 47.10, compared with 44.32 for AFFS, 44.36 for FRDM, and 44.16 for MPRB.

To evaluate the robustness of the BM-MCEFS algorithm, 10% label noise was introduced into a portion of the sequentially added samples in four selected datasets, and feature selection was then performed under these noisy conditions. In the experiments, "RAW" denotes the classification accuracy obtained with the original feature set, while "Noised data" refers to the accuracy obtained with the full feature set after injecting 10% label noise. The experimental results are presented in Figs. 13 and 14. As observed, the BM-MCEFS algorithm achieves higher classification accuracy in most cases, which is primarily attributed to the fact that it inherits the minimum classification error criterion from the MCEFS algorithm.

Fig. 13.

Classification accuracy with label noise added to a portion of the incrementally added samples (SVM).

Fig. 14.

Classification accuracy with label noise added to a portion of the incrementally added samples (KNN).

To further assess the statistical performance of the five algorithms, a comparative analysis was conducted on the computation times shown in Fig. 12 at the point where 50% of the samples had been incrementally added, and the resulting critical difference (CD) diagram is provided.

In this analysis, $n=12$ and, at a significance level of $\alpha=0.05$, the corresponding critical value56 $q_\alpha$ is used. The Friedman statistic is 46.64, which exceeds the critical value, leading to the rejection of the null hypothesis of equivalence and indicating significant differences among the algorithms. According to Equation 25, CD = 2.60. As shown in Fig. 15, the Nemenyi test confirms that the BM-MCEFS algorithm performs significantly better than the others, further highlighting its competitiveness.

Fig. 15.

The Nemenyi test results of the computational times for the five algorithms.

Conclusions

Feature selection is an effective approach to high-dimensional data analysis, as it reduces data redundancy while preserving essential discriminative information. Incremental learning further enhances this process by leveraging prior knowledge to efficiently adapt to dynamically evolving data environments. This paper investigates feature selection methods based on fuzzy rough sets and identifies a key limitation of the classical fuzzy positive region: its failure to fully exploit the rich membership information embedded in the fuzzy lower approximation. To address this issue, we propose a novel Minimum Classification Error-based Feature Selection framework (MCEFS). The method constructs continuous membership curves over the universe of discourse and quantifies inter-class separability using inner product correlation, thereby effectively uncovering discriminative information beyond the traditional fuzzy positive region. Moreover, by integrating efficient matrix computation strategies, the generation of the fuzzy lower approximation is significantly accelerated, substantially improving the computational efficiency of static feature selection. Building on this foundation, we further develop an incremental variant, BM-MCEFS, which employs a block matrix mechanism to dynamically update both the fuzzy relation matrix and the fuzzy decision matrix. By reusing and incrementally refining sub-blocks of these matrices, the algorithm avoids full recomputation during data updates, greatly reducing time overhead in dynamic scenarios. Experimental results on 12 benchmark datasets demonstrate that both MCEFS and BM-MCEFS achieve high classification accuracy while offering markedly superior computational efficiency compared to state-of-the-art methods.

The proposed algorithms hold significant practical value in real-world applications involving high-dimensional, streaming, or frequently updated data. For example, in smart urban management57, they can dynamically identify key indicators such as traffic flow, environmental quality, and land use to support resilient city planning; in agricultural cooperation systems58, they enable effective selection of environmental and socio-economic features, facilitating multidimensional sustainability assessments that go beyond yield alone; and in industrial production optimization59, they support real-time monitoring and feature-driven anomaly detection, thereby improving resource utilization and system stability. These capabilities align closely with current societal demands for sustainable development, digital transformation, and intelligent decision-making.

Nevertheless, the proposed method has certain limitations. Its effectiveness relies on the relative stability of the underlying data distribution, particularly the continuity of class-center sample structures. When class centers undergo abrupt shifts due to concept drift, the incremental update mechanism may lag behind. Additionally, although inner product correlation significantly strengthens feature discriminability, it also introduces additional computational overhead. In future work, we will focus on three main directions: designing block-update strategies that respond to feature-level changes to better capture local dynamics; extending the use of inner product correlation to broader supervised learning tasks, such as multi-label and imbalanced learning; integrating incremental and batch processing mechanisms to enhance robustness against concept drift. Our ultimate goal is to develop a more scalable, adaptive, and interpretable feature selection framework that can be effectively deployed across diverse real-world applications.

Author contributions

Z.W.C.: Conceptualization, Problem formulation, Methodology, Writing – original draft, Review and editing, Final draft. M.G.X.: Numerical analysis, Programming, Mathematical modeling. J.L.: Supervision, Problem formulation, Programming, Writing – review and editing, Final drafting.

Funding

This study was funded by the National Natural Science Foundation of China (No.62066044), the 2025 Autonomous Region Graduate Education Innovation Plan Project (No.XJ2025G209) and the Xinjiang Normal University Smart Education Engineering Technology Research Center Project (No.XJNU-ZHJY202410).

Data availability

The datasets supporting this study can be obtained from the corresponding author or downloaded directly from the provided URLs. The basic information of the 12 datasets used in this paper is as follows: Markelle Kelly, Rachel Longjohn, Kolby Nottingham, The UCI Machine Learning Repository (https://archive.ics.uci.edu/). The specific download URLs for each dataset are as follows: MAGIC Gamma Telescope: Available at https://archive.ics.uci.edu/datasets/?search=MAGIC+Gamma+Telescope. Room Occupancy Estimation: Available at https://archive.ics.uci.edu/datasets/?search=Room+Occupancy+Estimation. Shill+Bidding: Available at https://archive.ics.uci.edu/datasets/?search=Shill+Bidding. Wine Quality: Available at https://archive.ics.uci.edu/datasets/?search=Wine+Quality. Iranian Churn: Available at https://archive.ics.uci.edu/datasets/?search=Iranian+Churn. Hepatitis C Virus for Egyptian patients: Available at https://archive.ics.uci.edu//datasets/?search=Hepatitis+C+Virus+for+Egyptian+patients. Turkish Music Emotion: Available at https://archive.ics.uci.edu/datasets/?search=Turkish+Music+Emotion. TUANDROMD: Available at https://archive.ics.uci.edu/datasets/?search=TUANDROMD. Semeion Handwritten Digit: Available at https://archive.ics.uci.edu/dataset//178/semeion+handwritten+digit. Glass: Available at https://archive.ics.uci.edu/dataset/42/glass+identification. Connectionist Bench: Available at https://archive.ics.uci.edu/dataset/151/connectionist+bench+sonar+mines+vs+rocks. Iris: Available at https://archive.ics.uci.edu/dataset/53/iris.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Yin, Z., Liu, L., Chen, J., Zhao, B. & Wang, Y. Locally robust eeg feature selection for individual-independent emotion recognition. Expert Syst. Appl.162, 113768 (2020). [Google Scholar]
  • 2.Sheikhpour, R., Saberi-Movahed, F., Jalili, M. & Berahmand, K. Semi-supervised feature selection with concept factorization and robust label learning. Pattern Recognit. 112317 (2025).
  • 3.Kou, G. et al. Evaluation of feature selection methods for text classification with small datasets using multiple criteria decision-making methods. Appl. Soft Comput.86, 105836 (2020). [Google Scholar]
  • 4.Sheikhpour, R., Mohammadi, M., Berahmand, K., Saberi-Movahed, F. & Khosravi, H. Robust semi-supervised multi-label feature selection based on shared subspace and manifold learning. Inf. Sci.699, 121800 (2025). [Google Scholar]
  • 5.Fedorka P, Buchuk R, Klymenko M, Saibert F, Petrushyn A. The use of adaptive artificial intelligence (ai) learning models in decision support systems for smart regions. Journal of Research, Innovation and Technologies4, 99–115 (2025). [Google Scholar]
  • 6.Nejadshamsi, S., Bentahar, J., Eicker, U., Wang, C. & Jamshidi, F. A geographic-semantic context-aware urban commuting flow prediction model using graph neural network. Expert Syst. Appl.261, 125534 (2025). [Google Scholar]
  • 7.Qiu, Y., Bouraima, M., Badi, I., Stević, Ž & Simic, V. A decision-making model for prioritizing low-carbon policies in climate change mitigation. Chall. sustain12, 1–17 (2024). [Google Scholar]
  • 8.Chaoui, G., Yaagoubi, R. & Mastere, M. Integrating geospatial technologies and multi-criteria decision analysis for sustainable and resilient urban planning. Chall. Sustain13, 122–134 (2025). [Google Scholar]
  • 9.Fedorka, P., Buchuk, R., Klymenko, M., Saibert, F. & Petrushyn, A. The use of adaptive artificial intelligence (ai) learning models in decision support systems for smart regions. Journal of Research, Innovation and Technologies7, 99–115 (2025). [Google Scholar]
  • 10.Krause, A. & Köppel, J. A multi-criteria approach for assessing the sustainability of small-scale cooking and sanitation technologies. Challenges in Sustainability6, 1–19 (2018). [Google Scholar]
  • 11.Qiu, Y., Bouraima, M., Badi, I., Stević, Ž & Simic, V. A decision-making model for prioritizing low-carbon policies in climate change mitigation. Chall. sustain12, 1–17 (2024). [Google Scholar]
  • 12.Chaoui, G., Yaagoubi, R. & Mastere, M. Integrating geospatial technologies and multi-criteria decision analysis for sustainable and resilient urban planning. Challenges in Sustainability13, 122–134 (2025). [Google Scholar]
  • 13.Terentieva, K., Karpenko, I., Yarova, T., Shkvyria, O. & Pasko, Y. Technological innovation in digital brand management: Leveraging artificial intelligence and immersive experiences. Journal of Research, Innovation and Technologies4, 201–223 (2025). [Google Scholar]
  • 14.Wolf, B. M., Häring, A.-M. & Heß, J. Strategies towards evaluation beyond scientific impact: Pathways not only for agricultural research. Organic Farming1, 3–18 (2015). [Google Scholar]
  • 15.Dubois, D. & Prade, H. Rough fuzzy sets and fuzzy rough sets. International Journal of General System17, 191–209 (1990). [Google Scholar]
  • 16.Lang, G., Li, Q., Cai, M., Yang, T. & Xiao, Q. Incremental approaches to knowledge reduction based on characteristic matrices. Int. J. Mach. Learn. Cybern.8, 203–222 (2017). [Google Scholar]
  • 17.Liang, J., Wang, F., Dang, C. & Qian, Y. A group incremental approach to feature selection applying rough set technique. IEEE Trans. Knowl. Data Eng.26, 294–308 (2012). [Google Scholar]
  • 18.Sun, L., Zhang, X., Qian, Y., Xu, J. & Zhang, S. Feature selection using neighborhood entropy-based uncertainty measures for gene expression data classification. Inf. Sci.502, 18–41 (2019). [Google Scholar]
  • 19.Feng, Y., Hua, Z. & Liu, G. Partial reduction algorithms for fuzzy relation systems. Knowl.-Based Syst.188, 105047 (2020). [Google Scholar]
  • 20.Theerens, A. & Cornelis, C. Fuzzy rough sets based on fuzzy quantification. Fuzzy Sets Syst.473, 108704 (2023). [Google Scholar]
  • 21.Alnoor, A. et al. Toward a sustainable transportation industry: Oil company benchmarking based on the extension of linear diophantine fuzzy rough sets and multicriteria decision-making methods. IEEE Trans. Fuzzy Syst.31, 449–459 (2022). [Google Scholar]
  • 22.Riaz, M. & Hashmi, M. R. Linear diophantine fuzzy set and its applications towards multi-attribute decision-making problems. Journal of Intelligent & Fuzzy Systems37, 5417–5439 (2019). [Google Scholar]
  • 23.Yang, X., Chen, H., Li, T. & Luo, C. A noise-aware fuzzy rough set approach for feature selection. Knowl.-Based Syst.250, 109092 (2022). [Google Scholar]
  • 24.Ye, J., Zhan, J. & Xu, Z. A novel multi-attribute decision-making method based on fuzzy rough sets. Computers & Industrial Engineering155, 107136 (2021). [Google Scholar]
  • 25.He, J. et al. Attribute reduction in an incomplete categorical decision information system based on fuzzy rough sets. Artif. Intell. Rev.55, 5313–5348 (2022). [Google Scholar]
  • 26.Zhang, K. & Dai, J. Redefined fuzzy rough set models in fuzzy covering group approximation spaces. Fuzzy Sets Syst.442, 109–154 (2022). [Google Scholar]
  • 27.Deng, Z. et al. Feature selection for label distribution learning based on neighborhood fuzzy rough sets. Appl. Soft Comput.169, 112542 (2025). [Google Scholar]
  • 28.Wang, C., Huang, Y., Shao, M. & Fan, X. Fuzzy rough set-based attribute reduction using distance measures. Knowl.-Based Syst.164, 205–212 (2019). [Google Scholar]
  • 29.Zhang, X., Mei, C., Chen, D. & Li, J. Feature selection in mixed data: A method using a novel fuzzy rough set-based information entropy. Pattern Recogn.56, 1–15 (2016). [Google Scholar]
  • 30.Qian, W., Huang, J., Wang, Y. & Shu, W. Mutual information-based label distribution feature selection for multi-label learning. Knowl.-Based Syst.195, 105684 (2020). [Google Scholar]
  • 31.Qiu, Z. & Zhao, H. A fuzzy rough set approach to hierarchical feature selection based on hausdorff distance. Appl. Intell.52, 11089–11102 (2022). [Google Scholar]
  • 32.Sun, Y. & Zhu, P. Online group streaming feature selection based on fuzzy neighborhood granular ball rough sets. Expert Syst. Appl.249, 123778 (2024). [Google Scholar]
  • 33.An, S. et al. Relative fuzzy rough approximations for feature selection and classification. IEEE Transactions on Cybernetics53, 2200–2210 (2023). [DOI] [PubMed] [Google Scholar]
  • 34.Liang, P., Lei, D., Chin, K. & Hu, J. Feature selection based on robust fuzzy rough sets using kernel-based similarity and relative classification uncertainty measures. Knowl.-Based Syst.255, 109795 (2022). [Google Scholar]
  • 35.Zhang, Y., Wang, C., Huang, Y., Ding, W. & Qian, Y. Adaptive relative fuzzy rough learning for classification. IEEE Trans. Fuzzy Syst.32, 6267–6276 (2024). [Google Scholar]
  • 36.Chen, X., Lai, L. & Luo, M. A novel fusion and feature selection framework for multisource time-series data based on information entropy. IEEE Trans. Neural Netw. Learn. Syst. (2025). [DOI] [PubMed]
  • 37.Wang, C., Qian, Y., Ding, W. & Fan, X. Feature selection with fuzzy-rough minimum classification error criterion. IEEE Trans. Fuzzy Syst.30, 2930–2942 (2021). [Google Scholar]
  • 38.Xu, W. & Bu, Q. Matrix-based incremental feature selection method using weight-partitioned multigranulation rough set. Inf. Sci.681, 121219 (2024). [Google Scholar]
  • 39.Zhao, J. et al. Consistency approximation: Incremental feature selection based on fuzzy rough set theory. Pattern Recogn.155, 110652 (2024). [Google Scholar]
  • 40.Wang, T., Sun, B. & Jiang, C. Kernelized multi-granulation fuzzy rough set over hybrid attribute decision system and application to stroke risk prediction. Appl. Intell.53, 24876–24894 (2023). [Google Scholar]
  • 41.Sang, B. et al. Feature selection for dynamic interval-valued ordered data based on fuzzy dominance neighborhood rough set. Knowl.-Based Syst.227, 107223 (2021). [Google Scholar]
  • 42.Yu, J. & Xu, W. Incremental knowledge discovering in interval-valued decision information system with the dynamic data. Int. J. Mach. Learn. Cybern.8, 849–864 (2017). [Google Scholar]
  • 43.Xu, W., Yuan, K. & Li, W. Dynamic updating approximations of local generalized multigranulation neighborhood rough set. Appl. Intell.52, 9148–9173 (2022). [Google Scholar]
  • 44.Sang, B., Chen, H., Yang, L., Li, T. & Xu, W. Incremental feature selection using a conditional entropy based on fuzzy dominance neighborhood rough sets. IEEE Trans. Fuzzy Syst.30, 1683–1697 (2021). [Google Scholar]
  • 45.Wang, L., Pei, Z., Qin, K. & Yang, L. Incremental updating fuzzy tolerance rough set approach in intuitionistic fuzzy information systems with fuzzy decision. Appl. Soft Comput.151, 111119 (2024). [Google Scholar]
  • 46.Zhang, X., Liu, X. & Yang, Y. A fast feature selection algorithm by accelerating computation of fuzzy rough set-based information entropy. Entropy20, 788 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Zhang, X. & Li, J. Incremental feature selection approach to interval-valued fuzzy decision information systems based on -fuzzy similarity self-information. Inf. Sci.625, 593–619 (2023). [Google Scholar]
  • 48.Zhao, J., Ling, Y., Huang, F., Wang, J. & See-To, E. W. Incremental feature selection for dynamic incomplete data using sub-tolerance relations. Pattern Recogn.148, 110125 (2024). [Google Scholar]
  • 49.Qi, Z., Li, H., Liu, F., Chen, T. & Dai, J. Fusion decision strategies for multiple criterion preferences based on three-way decision. Information Fusion108, 102356 (2024). [Google Scholar]
  • 50.Xu, W., Yuan, Z. & Liu, Z. Feature selection for unbalanced distribution hybrid data based on k-nearest neighborhood rough set. IEEE Transactions on Artificial Intelligence5, 229–243 (2023). [Google Scholar]
  • 51.Gao, Y., Chen, D., Wang, H. & Shi, R. Optimization attribute reduction with fuzzy rough sets based on algorithm stability. IEEE Trans. Fuzzy Syst. (2023).
  • 52.Sang, B., Xu, W., Chen, H. & Li, T. Active antinoise fuzzy dominance rough feature selection using adaptive k-nearest neighbors. IEEE Trans. Fuzzy Syst.31, 3944–3958 (2023). [Google Scholar]
  • 53.Chen, J., Lin, Y., Mi, J., Li, S. & Ding, W. A spectral feature selection approach with kernelized fuzzy rough sets. IEEE Trans. Fuzzy Syst.30, 2886–2901 (2021). [Google Scholar]
  • 54.Dong, L., Wang, R. & Chen, D. Incremental feature selection with fuzzy rough sets for dynamic data sets. Fuzzy Sets Syst.467, 108503 (2023). [Google Scholar]
  • 55.Huang, W., She, Y., He, X. & Ding, W. Fuzzy rough sets-based incremental feature selection for hierarchical classification. IEEE Trans. Fuzzy Syst.31, 3721–3733 (2023). [Google Scholar]
  • 56.Demšar, J. Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res.7, 1–30 (2006). [Google Scholar]
  • 57.Ulker Senkulak, B., Kanoglu, A. & Ozcevik, O. Simurg_cities conceptual model: Multi-dimensional and multi-layer performance-based assessment of urban sustainability at the city level. Chall. Sustain13, 425–444 (2025). [Google Scholar]
  • 58.Utomo, B., Soedarto, T., Winarno, S. & Hendrarini, H. Predicting the success of coffee farmer partnerships using factor analysis and multiple linear regression. Org. Farming11, 61–71 (2025). [Google Scholar]
  • 59.Baidalina, S. et al. Enhancing nutritional value and production efficiency of feeds through biochemical composition optimization. Org. Farming10, 80–93 (2024). [Google Scholar]
