2024 May 7;11(5):464. doi: 10.3390/bioengineering11050464
Algorithm 2 Ensemble Feature Selection.
  • 1.

    Input: A dataset D with m samples and n features, and a positive integer K indicating the number of features to select.

  • 2.

    Output: A subset of K features that are highly correlated with the target variable but uncorrelated with each other.

  • 3.

    Extract graphical features using the fast fractional Fourier transform.

  • 4.

    For i=1 to n:

    • (a)
      Calculate the information gain (IG) of feature i on dataset D, writing A for feature i and H for entropy:
      $IG(D, A) = H(D) - H(D \mid A)$
    • (b)
      Calculate ReliefF score for feature i based on differences between samples:
      $\mathrm{ReliefF}(i) = \frac{\sum_{j=1}^{m} \mathrm{diff}(i, j)}{m}$
    • (c)
      Calculate variance score for feature i:
      $\mathrm{Variance}(i) = \frac{1}{m} \sum_{j=1}^{m} \left( X_{j,i} - \bar{X}_i \right)^2$
    • (d)
      Calculate NCA score for feature i based on the conditional probability p(i|j):
      $\mathrm{NCA}(i) = \sum_{j=1}^{m} p(i \mid j)$
    • (e)
      Calculate CFS score for feature i by considering correlations between features and the target variable:
      $\mathrm{CFS}(i) = \frac{\mathrm{cor}(X_i, Y)}{\mathrm{var}(X_i) \cdot \mathrm{var}(Y)} \cdot \frac{2}{\mathrm{cor}(X_i, X_i) + \mathrm{cor}(Y, Y)}$
  • 5.

    Combine the scores for each feature into a single score, using either their mean or a weighted average.

  • 6.

    Select the K features with the highest combined scores.
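
The score-combine-select part of the algorithm (steps 4 to 6) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation. It covers only two of the five criteria (information gain and variance) and leaves out ReliefF, NCA, and CFS. The function names, the equal-width binning used to discretise features for the entropy calculation, and the min-max rescaling applied before averaging are all assumptions made for this example; the algorithm itself does not specify them.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(column, labels, bins=4):
    """IG(D, A) = H(D) - H(D|A); A is discretised into equal-width bins (assumption)."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins or 1.0  # guard against a constant column
    binned = [min(int((x - lo) / width), bins - 1) for x in column]
    n = len(labels)
    conditional = 0.0
    for b in set(binned):
        subset = [y for a, y in zip(binned, labels) if a == b]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional

def variance_score(column, labels=None):
    """Population variance (1/m) * sum_j (x_j - mean)^2; labels are unused."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)

def min_max(scores):
    """Rescale scores to [0, 1] so criteria on different scales are comparable (assumption)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def ensemble_select(X, y, k, scorers, weights=None):
    """Return (indices of the k best features, combined score per feature).

    X is a list of m rows with n values each; every scorer maps
    (column, labels) to a float; weights=None gives a plain mean.
    """
    columns = list(zip(*X))
    weights = weights or [1.0] * len(scorers)
    total = sum(weights)
    combined = [0.0] * len(columns)
    for scorer, w in zip(scorers, weights):
        normalised = min_max([scorer(col, y) for col in columns])
        combined = [c + (w / total) * s for c, s in zip(combined, normalised)]
    ranked = sorted(range(len(columns)), key=lambda i: combined[i], reverse=True)
    return ranked[:k], combined

# Toy data: feature 0 separates the classes, feature 1 is constant,
# feature 2 varies but carries no class information.
X = [
    [0.1, 5.0, 3.0],
    [0.2, 5.0, 1.0],
    [0.3, 5.0, 2.0],
    [0.9, 5.0, 3.0],
    [1.0, 5.0, 1.0],
    [1.1, 5.0, 2.0],
]
y = [0, 0, 0, 1, 1, 1]

selected, combined = ensemble_select(X, y, k=2, scorers=[information_gain, variance_score])
print(selected)  # [0, 2]
```

Passing `weights=[1.0, 0.0]` reduces the ensemble to information gain alone, which is a convenient way to check how much each criterion contributes to the final ranking.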