Input: data matrix X and response vector y, where n is the number of data samples and p is the number of features. Depending on the FS algorithm, additional hyper-parameters and/or pre-processing of the data (e.g., discretization) may be needed.
Process:
1. For each FS algorithm, create an empty set S, which will contain the indices of the selected features.
2. Randomly select 90% of the data samples from the data matrix X, along with their responses y.
3. Run the FS algorithm on the 90% randomly selected samples. The result is an ordered sequence of k features (often the number of output features can be chosen, which saves computational time), where the first feature is the most important according to the chosen FS algorithm.
4. Repeat steps 2 and 3 multiple times, say M, and store the results in a matrix F of size M × k. Each of the M rows of F stores the feature subset selected in one repetition.
5. Voting to decide the final feature subset for each FS algorithm: feature indices are included in S incrementally, one at a time. At each step j (j = 1, …, k), we find the indices of the features selected up to that step across all M repetitions from step 4 (i.e., we use the submatrix formed by the first j columns of F; in the last step, j = k).
6. We select the feature index that appears most frequently among these M · j elements and that is not already included in S. This index is included as the j-th element of S. Ties are resolved by including the lowest index number.
7. Repeat steps 5 and 6 until S contains the desired number of features (i.e., j = 1, …, k).
Output: vector S with the ordered sequence of selected features in descending order of importance. The indices in this sequence correspond to the columns of the data matrix X.
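The procedure above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names (`ensemble_feature_selection`, `fs_algorithm`) and the callback interface are assumptions introduced for the example, and any concrete FS method returning an ordered list of k feature indices can be plugged in.

```python
import numpy as np

def ensemble_feature_selection(X, y, fs_algorithm, k, n_repeats=50,
                               subsample=0.9, seed=0):
    """Repeated-subsampling FS with voting (steps 1-7 above).

    fs_algorithm(X_sub, y_sub, k) is a hypothetical interface: it must
    return k feature indices ordered from most to least important.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    m = int(round(subsample * n))          # 90% of the samples by default
    # Steps 2-4: run FS on M random subsamples; row r of F holds the
    # ordered feature subset selected in repetition r (F is M x k).
    F = np.empty((n_repeats, k), dtype=int)
    for r in range(n_repeats):
        idx = rng.choice(n, size=m, replace=False)
        F[r] = fs_algorithm(X[idx], y[idx], k)
    # Steps 5-7: voting. At step j, count how often each feature appears
    # among the first j columns of F across all M repetitions.
    S = []
    for j in range(1, k + 1):
        votes = np.bincount(F[:, :j].ravel(), minlength=X.shape[1])
        votes[S] = -1                      # exclude already-selected features
        S.append(int(np.argmax(votes)))    # argmax breaks ties by lowest index
    return S
```

As a usage example, a simple correlation-ranking FS callback (again an illustrative stand-in, not a method from the source) could be:

```python
def corr_rank(Xs, ys, k):
    # Absolute correlation of each feature column with the response.
    scores = np.abs(np.corrcoef(Xs.T, ys)[:-1, -1])
    return np.argsort(-scores)[:k]
```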