Author manuscript; available in PMC: 2015 Sep 1.
Published in final edited form as: Biometrics. 2014 May 30;70(3):695–707. doi: 10.1111/biom.12191

Table 1.

True risk models and marker distributions for the seven simulation scenarios. The linear logistic regression model logit P(D = 1|T, Y) = Ỹβ1 + TỸβ2, with Ỹ = (1, Y), and the classification tree with {T, TY, (1 − T)Y} as predictors are evaluated as working models in all scenarios.
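As a minimal sketch of the linear logistic working model in the caption, the following builds the covariate row (Ỹ, TỸ) with Ỹ = (1, Y) and evaluates logit P(D = 1|T, Y) = Ỹβ1 + TỸβ2. The function names `design_row` and `working_model_risk` are hypothetical, and the coefficients are the Scenario 1 values from the table, used here only for illustration.

```python
import math


def design_row(t, y):
    """Return the working-model covariates (Y~, T*Y~) with Y~ = (1, Y)."""
    y_tilde = [1.0] + list(y)
    return y_tilde + [t * v for v in y_tilde]


def working_model_risk(t, y, beta1, beta2):
    """P(D = 1 | T, Y) under logit P = Y~ beta1 + T * Y~ beta2."""
    row = design_row(t, y)
    coefs = list(beta1) + list(beta2)
    eta = sum(x * b for x, b in zip(row, coefs))
    return 1.0 / (1.0 + math.exp(-eta))


# Scenario 1 coefficients as listed in Table 1 (illustrative use only).
BETA1 = (0.3, 0.2, -0.2, -0.2)    # intercept and main effects of Y1, Y2, Y3
BETA2 = (-0.1, -2.0, -0.7, -0.1)  # treatment effect and T-by-marker interactions

y = (1.0, 0.0, 0.0)  # hypothetical marker values
risk_untreated = working_model_risk(0, y, BETA1, BETA2)
risk_treated = working_model_risk(1, y, BETA1, BETA2)
print(risk_untreated, risk_treated)
```

Under these coefficients, a subject with a high Y1 has a lower modeled risk with T = 1 than with T = 0, which is the kind of marker-by-treatment interaction the working model is meant to capture.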

Scenario True risk model Marker distribution
The linear logistic working model is correctly specified. 1 logit P(D = 1|T, Y) = 0.3 + 0.2Y1 − 0.2Y2 − 0.2Y3 + T(−0.1 − 2Y1 − 0.7Y2 − 0.1Y3) Y1, Y2, and Y3 are independent N(0, 1)
2 logit P(D = 1|T, Y) = 0.3 + 0.2Y1 − 0.2Y2 − 0.2Y3 + T(−0.1 − 2Y1 − 0.7Y2 − 0.1Y3) Same as Scenario 1 except for 2% of high leverage observations where Y1 ~ Uniform (8, 9)
Link function in the linear logistic working model is incorrectly specified. 3 log{−logP(D = 1|T, Y)} = −0.7 − 0.2Y1 − 0.2Y2 + 0.1Y3 + T(0.1 + 2Y1Y2 − 0.3Y3) Same as Scenario 1
Link function and main effects in the linear logistic working model are incorrectly specified. 4 log{−log P(D = 1|T, Y)} = 2 − 1.5Y1² − 1.5Y2² + 3Y1Y2 + T(−0.1 − Y1 + Y2) Y1 and Y2 are independent Uniform (−1.5, 1.5)
Main effects and interactions are incorrectly specified in the linear logistic working model. 5 logit P(D = 1|T, Y) = −0.1 − 0.2Y1 + 0.2Y2 − 0.1Y3 + Y1² + T(−0.5 − 2Y1 − Y2 − 0.1Y3 + 2Y1²) Same as Scenario 1
6 logit P(D = 1|T, Y) = 0.1 − 0.2Y1 + 0.2Y2Y1Y2 + T(−0.5 − Y1 + Y2 + 3Y1Y2)
Linear logistic working model is mis-specified for outlying observations. 7 P(D = 1|T, Y) = 1{Y1 < 8} · 1/(1 + e^(−η)) + 1{Y1 ≥ 8} · {1 − 1/(1 + e^(−η))}, where η = 0.3 + 0.2Y1 − 0.2Y2 − 0.2Y3 + T(−0.1 − 2Y1 − 0.7Y2 − 0.1Y3) Same as Scenario 2
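To make the data-generating mechanisms concrete, here is a minimal simulation sketch for Scenarios 1, 2, and 7, under several assumptions that are not stated in the table: treatment is assigned independently with probability 0.5, each observation is independently a high-leverage point with probability 0.02 (rather than exactly 2% of the sample), and Scenario 7 uses the same linear predictor η as Scenarios 1 and 2, with the risk flipped to 1 − expit(η) when Y1 ≥ 8. All function and parameter names are hypothetical.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def eta(t, y1, y2, y3):
    """Linear predictor of Scenarios 1 and 2 (assumed shared by Scenario 7)."""
    return (0.3 + 0.2 * y1 - 0.2 * y2 - 0.2 * y3
            + t * (-0.1 - 2.0 * y1 - 0.7 * y2 - 0.1 * y3))


def draw_markers(rng, leverage_frac=0.0):
    """Independent N(0, 1) markers; Y1 ~ Uniform(8, 9) with prob. leverage_frac."""
    y = [rng.gauss(0.0, 1.0) for _ in range(3)]
    if rng.random() < leverage_frac:
        y[0] = rng.uniform(8.0, 9.0)
    return y


def true_risk(t, y, flip_outliers=False):
    """P(D = 1 | T, Y); with flip_outliers, risk is 1 - expit(eta) when Y1 >= 8."""
    p = expit(eta(t, *y))
    if flip_outliers and y[0] >= 8.0:
        return 1.0 - p
    return p


def simulate(n, leverage_frac=0.0, flip_outliers=False, p_treat=0.5, seed=0):
    """Return n rows (D, T, Y1, Y2, Y3). p_treat = 0.5 is an assumption."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = 1 if rng.random() < p_treat else 0
        y = draw_markers(rng, leverage_frac)
        d = 1 if rng.random() < true_risk(t, y, flip_outliers) else 0
        rows.append((d, t, *y))
    return rows


scenario1 = simulate(1000)                                          # Scenario 1
scenario2 = simulate(1000, leverage_frac=0.02)                      # Scenario 2
scenario7 = simulate(1000, leverage_frac=0.02, flip_outliers=True)  # Scenario 7
```

Scenarios 2 and 7 share a marker distribution and differ only in how the outlying observations relate to the outcome, so a single `flip_outliers` switch is enough to move between them in this sketch.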