2020 Oct 21;40(2):369–381. doi: 10.1002/sim.8779

TABLE 1.

Estimands describing the uncertainty of model estimation incurred by variable selection, their approximation by simulation, and their estimation by resampling

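Columns: Estimand; Definition; Approximation by simulation; Resampling-based estimator

Estimand: $\mathrm{VIF}_j$
Definition: $E\left[I(\hat\beta_j \neq 0)\right]$
Approximation by simulation: $\sum_{q=1}^{Q} I(\hat\beta_j^{q} \neq 0) / Q$
Resampling-based estimator: $\sum_{b=1}^{B} I(\hat\beta_j^{b} \neq 0) / B$

Estimand: $\mathrm{MSF}(J)$
Definition: $E\left[\prod_{j \in J} I(\hat\beta_j \neq 0) \cdot \prod_{j \in \bar{J}} I(\hat\beta_j = 0)\right]$
Approximation by simulation: $\sum_{q=1}^{Q} \left[\prod_{j \in J} I(\hat\beta_j^{q} \neq 0) \cdot \prod_{j \in \bar{J}} I(\hat\beta_j^{q} = 0)\right] / Q$
Resampling-based estimator: $\sum_{b=1}^{B} \left[\prod_{j \in J} I(\hat\beta_j^{b} \neq 0) \cdot \prod_{j \in \bar{J}} I(\hat\beta_j^{b} = 0)\right] / B$

Estimand: $\mathrm{RCB}_j$
Definition: $\dfrac{E(\hat\beta_j)}{\beta_j \cdot \mathrm{VIF}_j} - 1$
Approximation by simulation: $\dfrac{\sum_{q=1}^{Q} \hat\beta_j^{q}}{\beta_j \cdot \mathrm{VIF}_j \cdot Q} - 1$
Resampling-based estimator: $\dfrac{\sum_{b=1}^{B} \hat\beta_j^{b}}{\tilde\beta_j \cdot \widehat{\mathrm{VIF}}_j \cdot B} - 1$

Estimand: $\mathrm{RMSDR}_j$
Definition: $\sqrt{\dfrac{E(\hat\beta_j - \beta_j)^2}{E(\tilde\beta_j - \beta_j)^2}}$
Approximation by simulation: $\sqrt{\dfrac{\sum_{q=1}^{Q} (\hat\beta_j^{q} - \beta_j)^2 / Q}{\sum_{q=1}^{Q} (\tilde\beta_j^{q} - \beta_j)^2 / Q}}$
Resampling-based estimator: $\sqrt{\dfrac{\sum_{b=1}^{B} (\hat\beta_j^{b} - \tilde\beta_j)^2 / B}{\tilde\sigma_j^2}}$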

Note: Superscripts $q$ and $b$ indicate estimates obtained in the $q$th simulated dataset and the $b$th resample, respectively. Estimates $\hat\beta_j$ and $\hat\beta_j^{b}$ are set to 0 if variable $j$ is not selected by the variable selection algorithm in the corresponding model. For the index sets $J$ and $\bar{J}$, $J \cup \bar{J} = \{1, \ldots, k\}$ and $J \cap \bar{J} = \emptyset$. $I(\cdot)$ is the indicator function, that is, it is 1 if the expression $(\cdot)$ is true and 0 otherwise.

Abbreviations: MSF, model selection frequency; RCB, relative conditional bias; RMSDR, root mean squared difference ratio; VIF, variable inclusion frequency.
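
The resampling-based estimators in the last column of Table 1 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The function names `resampling_estimates` and `model_selection_frequency` and the argument names are hypothetical. The sketch assumes that `beta_hat` holds one row of coefficients per resample, with 0 for variables that were not selected, that `beta_tilde` holds the reference estimates written as β̃ in the table (taken here to come from a model fitted without selection, which the table itself does not state), and that `sigma_tilde_sq` holds the corresponding variance estimates. Indices are zero-based in the code.

```python
import math


def resampling_estimates(beta_hat, beta_tilde, sigma_tilde_sq):
    """Per-variable VIF, RCB, and RMSDR estimates from B resamples.

    beta_hat:        list of B rows, each with k coefficients (0 = not selected)
    beta_tilde:      list of k reference estimates (assumed: no selection applied)
    sigma_tilde_sq:  list of k variance estimates for beta_tilde (assumption)
    """
    B = len(beta_hat)
    k = len(beta_tilde)
    results = []
    for j in range(k):
        column = [row[j] for row in beta_hat]

        # VIF_j: share of resamples in which variable j was selected.
        vif = sum(1 for value in column if value != 0) / B

        # RCB_j: mean of the estimates relative to beta_tilde_j * VIF_j, minus 1.
        # Undefined if the variable is never selected or beta_tilde_j is 0.
        if vif > 0 and beta_tilde[j] != 0:
            rcb = sum(column) / (beta_tilde[j] * vif * B) - 1
        else:
            rcb = float("nan")

        # RMSDR_j: root of the mean squared difference to beta_tilde_j,
        # divided by the variance estimate of beta_tilde_j.
        msd = sum((value - beta_tilde[j]) ** 2 for value in column) / B
        rmsdr = math.sqrt(msd / sigma_tilde_sq[j])

        results.append({"vif": vif, "rcb": rcb, "rmsdr": rmsdr})
    return results


def model_selection_frequency(beta_hat, J):
    """MSF(J): share of resamples that select exactly the variables in J."""
    J = set(J)
    k = len(beta_hat[0])
    hits = 0
    for row in beta_hat:
        # Every variable in J must be selected and every other one excluded.
        if all((row[j] != 0) == (j in J) for j in range(k)):
            hits += 1
    return hits / len(beta_hat)
```

Because unselected coefficients are stored as 0, the plain mean of a column divided by the inclusion frequency equals the mean over the resamples in which the variable was selected, which is why the RCB formula needs no explicit conditioning.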