Repeat 100 times:
    Divide the data into 10 outer folds.
    Repeat 10 times, holding out each outer fold in turn:
        Keep 1 outer fold for testing.
        Select the remaining 9 outer folds for training.
        Divide the 9 outer training folds into 10 inner folds.
        Repeat 10 times, holding out each inner fold in turn:
            Keep 1 inner fold for testing.
            Select the remaining 9 inner folds for training.
            Move all variables into the list of available variables.
            Create an empty list of nested model variables.
            Iterate this backward selection procedure until only 1 variable is left in the list of available variables:
                Train Cox models on the inner training set. Each Cox model contains all available variables except one, leaving out each variable in turn.
                Select the variable that contributes the least to the model likelihood.
                Move the selected variable from the list of available variables to the top of the list of nested model variables.
            Move the last available variable to the top of the list of nested model variables.
            Iterate over the list of nested model variables:
                Train the Cox model containing the present variable and the variables above it in the list of nested model variables using the inner training set.
                Evaluate the average time-dependent area under the receiver operating characteristic curve (ATD-AUCROC) $h$ of the present Cox model using the 1 inner testing fold.
                Record the variable usage $U$ in the present Cox model and the size $n$ of the model. $U_X(v_m) = 1$ if $v_m$ is in model $X$, 0 otherwise.
        Estimate:
            - the expected model size: $\langle n \rangle = \sum_X h_X n_X / \sum_X h_X$
            - the (inner) variable stability score for each variable $v_m$: $\langle v_m \rangle = \sum_X h_X U_X(v_m) / \sum_X h_X$
        Train the Cox model containing the $\langle n \rangle$ most stable variables using the outer training set.
        Evaluate the ATD-AUCROC $k$ of the present Cox model using the 1 outer testing fold.
        Record the variable usage $T$ in the present Cox model and the size $s$ of the model. $T_X(v_m) = 1$ if $v_m$ is in model $X$, 0 otherwise.
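
The two "Estimate" quantities are AUC-weighted averages over the recorded inner models, and a small sketch may make the bookkeeping concrete. The Python below is illustrative only and is not the authors' implementation. It assumes each recorded model X is stored as a pair (h, variables), so that its size n is simply the number of variables and U is an indicator of membership. The function names, the example variable names (age, stage, grade), the rounding of the expected model size to an integer, and the alphabetical tie-break are all assumptions that the procedure above does not specify.

```python
from collections import defaultdict


def weighted_estimates(models):
    """Return (expected_size, stability) for a list of (h, variables) pairs.

    h is the ATD-AUCROC of one inner model; variables is the collection of
    variable names it contains. Both outputs are weighted by h.
    """
    total_h = sum(h for h, _ in models)
    if total_h <= 0:
        raise ValueError("the sum of the weights h must be positive")

    # <n> = sum_X(h_X * n_X) / sum_X(h_X), with n_X = number of variables in X
    expected_size = sum(h * len(set(vs)) for h, vs in models) / total_h

    # <v_m> = sum_X(h_X * U_X(v_m)) / sum_X(h_X), with U_X(v_m) = 1 if v_m in X
    stability = defaultdict(float)
    for h, vs in models:
        for v in set(vs):
            stability[v] += h / total_h
    return expected_size, dict(stability)


def most_stable(stability, expected_size):
    """Pick the round(<n>) variables with the highest stability score.

    Assumptions: <n> is rounded with Python's built-in round(), at least one
    variable is kept, and ties are broken alphabetically.
    """
    k = max(1, round(expected_size))
    ranked = sorted(stability, key=lambda v: (-stability[v], v))
    return ranked[:k]


# Hypothetical record of three nested models from one inner fold.
models = [
    (0.5, ["age"]),
    (0.75, ["age", "stage"]),
    (0.75, ["age", "stage", "grade"]),
]
expected_size, stability = weighted_estimates(models)
selected = most_stable(stability, expected_size)
print(expected_size)  # 2.125
print(stability)      # age: 1.0, stage: 0.75, grade: 0.375
print(selected)       # ['age', 'stage']
```

Because the weights are the AUC values themselves, a model that discriminates poorly still contributes to both estimates, just with less influence. In the full procedure the list passed to `weighted_estimates` would hold the models recorded across all 10 inner folds of one outer training set.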