2024 Jun 24;14(13):1339. doi: 10.3390/diagnostics14131339
Algorithm 1. Feature selection using Lasso Regression.
  1. The input is as follows:

     X: Design matrix containing predictor variables (features).
     y: Vector of observed target values.
     λ: Candidate values of the regularization parameter for LASSO regression.
     k: Number of folds for cross-validation.
  2. Standardize the data:
     i. Center each feature by subtracting its mean.
     ii. Scale each feature by dividing by its standard deviation.
  3. Initialize an empty list to store selected features.
  4. Perform k-fold cross-validation:
     i. Split the data into k equal-sized folds.
     ii. For each fold:
        a. Use the remaining (k − 1) folds as the training set and the current fold as the validation set.
        b. Fit a LASSO regression model on the training data for each candidate value of λ.
        c. Evaluate the fitted model on the validation set using an appropriate performance metric.
        d. Record the coefficients of the LASSO model.
  5. Average the performance metric across all folds for each value of λ.
  6. Select the optimal value of λ, i.e., the one that minimises the cross-validation error.
  7. Fit a LASSO regression model on the entire dataset using the selected λ.
  8. Extract the coefficients of the LASSO model.
  9. Identify the features with non-zero coefficients and add them to the list of selected features.
  10. The output is as follows:
     i. List of selected features.
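
The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`standardize`, `soft_threshold`, `fit_lasso`, `mse`, `lasso_select`) are hypothetical, the LASSO objective is assumed to be (1/2n)·‖y − b0 − Xw‖² + λ·‖w‖₁ solved by coordinate descent, mean squared error is assumed as the validation metric, and λ is assumed to be chosen from a small grid of candidate values. The synthetic data at the end are for demonstration only.

```python
import random
from statistics import fmean, pstdev

def standardize(X):
    """Step 2: center each feature and scale it to unit standard deviation."""
    cols = list(zip(*X))
    means = [fmean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_lasso(X, y, lam, n_sweeps=100):
    """Coordinate descent for (1/2n)*||y - b0 - Xw||^2 + lam*||w||_1 (assumed form)."""
    n = len(X)
    cols = list(zip(*X))
    norms = [sum(v * v for v in c) / n for c in cols]
    b0, w = fmean(y), [0.0] * len(cols)
    resid = [yi - b0 for yi in y]  # residuals for the current (b0, w)
    for _ in range(n_sweeps):
        for j, (col, norm) in enumerate(zip(cols, norms)):
            if norm == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(v * r for v, r in zip(col, resid)) / n + norm * w[j]
            new_wj = soft_threshold(rho, lam) / norm
            delta = new_wj - w[j]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, col)]
                w[j] = new_wj
        shift = fmean(resid)  # re-fit the unpenalized intercept
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, w

def mse(X, y, b0, w):
    """Mean squared prediction error (assumed validation metric)."""
    preds = [b0 + sum(wj * v for wj, v in zip(w, row)) for row in X]
    return fmean([(yi - pi) ** 2 for yi, pi in zip(y, preds)])

def lasso_select(X, y, lambdas, k=5, seed=0):
    """Steps 2-9: choose lambda by k-fold CV, refit, keep non-zero coefficients."""
    Xs = standardize(X)
    idx = list(range(len(Xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k folds of (nearly) equal size
    cv_error = {}
    for lam in lambdas:
        errs = []
        for fold in folds:
            held = set(fold)
            train = [i for i in idx if i not in held]
            b0, w = fit_lasso([Xs[i] for i in train], [y[i] for i in train], lam)
            errs.append(mse([Xs[i] for i in fold], [y[i] for i in fold], b0, w))
        cv_error[lam] = fmean(errs)  # step 5: average over folds
    best_lam = min(cv_error, key=cv_error.get)  # step 6: lowest CV error
    _, w = fit_lasso(Xs, y, best_lam)  # step 7: refit on all data
    selected = [j for j, wj in enumerate(w) if wj != 0.0]  # steps 8-9
    return selected, best_lam, w

# Synthetic demonstration: only features 0 and 3 influence y.
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(80)]
y = [3.0 * r[0] - 2.0 * r[3] + rng.gauss(0, 0.5) for r in X]
selected, best_lam, coefs = lasso_select(X, y, lambdas=[0.01, 0.1, 0.5], k=5)
print("selected feature indices:", selected, "lambda:", best_lam)
```

The sketch keeps only the cross-validation error per λ and does not store the per-fold coefficients from step 4(d), since they are not needed to produce the final feature list. In practice, a library routine such as scikit-learn's `LassoCV` would typically replace the hand-written solver.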