1a) |
-
i)
Calculate the topological overlap matrix (TOM) separately for observations with E = 0 and E = 1 using WGCNA::TOMsimilarityFromExpr (Langfelder & Horvath, 2008)
-
ii)
Compute the Euclidean distance matrix of the absolute difference between the two matrices from (i) using stats::dist
-
iii)
Run the dynamicTreeCut algorithm (Langfelder et al., 2008; Langfelder, Zhang, & Horvath, 2016) on the distance matrix to determine the number of clusters and the cluster membership using dynamicTreeCut::cutreeDynamic with minClusterSize = 50
|
1b) |
-
i)
Compute the 1st PC or the average for each cluster using stats::prcomp or base::mean
-
ii)
Penalized regression model: create a design matrix of the derived cluster representatives and their interactions with E using stats::model.matrix
-
iii)
MARS model: create a design matrix of the derived cluster representatives and E
|
2) |
-
i)
For linear models, run penalized regression on the design matrix from Step 1b using glmnet::cv.glmnet (Friedman et al., 2010). The elastic net mixing parameter alpha = 1 corresponds to the lasso; alpha = 0.5 is the value we used in our simulations for the elastic net. The tuning parameter lambda is selected by minimizing the 10-fold cross-validated mean squared error (MSE).
-
ii)
For nonlinear effects, run MARS on the design matrix from Step 1b using earth::earth (Milborrow, 2011; derived from mda::mars by T. Hastie and R. Tibshirani) with pruning method pmethod = "backward" and maximum number of model terms nk = 1000. The degree (1 or 2) is chosen using 10-fold cross-validation (CV), and within each fold the number of terms in the model is the one that minimizes the generalized cross-validation (GCV) error.
|
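Steps 1b and 2 can be sketched in R as follows. This is a minimal illustration on simulated data, not the authors' implementation. The objects `X`, `E`, `y`, and `membership` are hypothetical; `membership` stands in for the cluster labels that Step 1a would produce, and `rowMeans` is used as a vectorized stand-in for applying base::mean to each observation. The model fits run only if the glmnet and earth packages are installed.

```r
# Toy data; every object name below is hypothetical.
set.seed(1)
n <- 100
p <- 20
E <- rep(0:1, each = n / 2)                  # binary exposure
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- rnorm(n)

# Assumed output of Step 1a: one cluster label per column of X.
membership <- rep(1:4, each = p / 4)
cluster_ids <- sort(unique(membership))

# Step 1b(i): summarize each cluster by its average or its 1st PC.
rep_avg <- sapply(cluster_ids, function(k) {
  rowMeans(X[, membership == k, drop = FALSE])
})
rep_pc1 <- sapply(cluster_ids, function(k) {
  stats::prcomp(X[, membership == k, drop = FALSE])$x[, 1]
})
colnames(rep_avg) <- paste0("avg", cluster_ids)
colnames(rep_pc1) <- paste0("pc", cluster_ids)

# Step 1b(ii): representatives, E, and representative-by-E interactions.
dat <- data.frame(rep_avg, E = E)
rhs <- paste0("(", paste(colnames(rep_avg), collapse = " + "), ") * E")
design_pen <- stats::model.matrix(stats::as.formula(paste("~", rhs)),
                                  data = dat)[, -1]  # drop the intercept column

# Step 1b(iii): representatives and E only; MARS builds its own interactions.
design_mars <- cbind(rep_avg, E = E)

# Step 2(i): lasso (alpha = 1) with lambda chosen by 10-fold CV on MSE.
if (requireNamespace("glmnet", quietly = TRUE)) {
  cv_fit <- glmnet::cv.glmnet(design_pen, y, alpha = 1, nfolds = 10,
                              type.measure = "mse")
  lambda_hat <- cv_fit$lambda.min
}

# Step 2(ii): MARS with backward pruning; only degree = 1 is fitted here.
if (requireNamespace("earth", quietly = TRUE)) {
  mars_fit <- earth::earth(x = design_mars, y = y, pmethod = "backward",
                           nk = 1000, degree = 1)
}
```

Setting `alpha = 0.5` in the `cv.glmnet` call gives the elastic net variant. The outer 10-fold CV loop that chooses between `degree = 1` and `degree = 2` is omitted here for brevity.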