2019 Dec 10;2:122. doi: 10.1038/s41746-019-0194-x

Table 1.

Parameters for sampling images and modeling outcome data.

Symbol   Variable         Model
$u_1$    Aggressiveness   $N(0,\ 0.7071)$
$u_2$    Fitness          $N(0,\ 0.7071)$
$z$      Heterogeneity    $N(0,\ 1)$
$x$      Size             $N(u_1 - u_2;\ 0.05)$
$t$      Treatment        $\mathrm{Bern}(\mathrm{invlogit}(N(u_2 - 0.5,\ 0.25)))$
$y$      Survival         $N(t - z - 2u_1 - 0.5;\ 0.05)$
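
The sampling scheme in Table 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `simulate`, the helper `invlogit`, the seed, and the sample size are hypothetical, the second argument of each normal distribution is taken to be a standard deviation, and the means are assumed to be u1 − u2, u2 − 0.5, and t − z − 2·u1 − 0.5 (the sign of the 0.5 offset in the survival model is an assumption).

```python
import numpy as np


def invlogit(a):
    """Inverse logit (logistic sigmoid): maps log-odds to probabilities."""
    return 1.0 / (1.0 + np.exp(-a))


def simulate(n, seed=0):
    """Draw n observations following the parametric model of Table 1.

    Assumption: every second distribution argument is a standard deviation.
    """
    rng = np.random.default_rng(seed)

    # Independent latent noise variables.
    u1 = rng.normal(0.0, 0.7071, n)  # tumor aggressiveness
    u2 = rng.normal(0.0, 0.7071, n)  # patient fitness
    z = rng.normal(0.0, 1.0, n)      # heterogeneity

    # Collider: size is the difference u1 - u2 plus small Gaussian noise.
    x = rng.normal(u1 - u2, 0.05)

    # Treatment: noisy log-odds, then a Bernoulli draw through the logistic link.
    log_odds = rng.normal(u2 - 0.5, 0.25)
    t = rng.binomial(1, invlogit(log_odds))

    # Survival: increases with treatment, decreases with z and u1.
    y = rng.normal(t - z - 2.0 * u1 - 0.5, 0.05)

    return {"u1": u1, "u2": u2, "z": z, "x": x, "t": t, "y": y}


if __name__ == "__main__":
    data = simulate(10_000)
    print("std of x:", np.std(data["x"]))
    print("fraction treated:", np.mean(data["t"]))
```

By construction, a linear regression of `y` on `t`, `z`, and `u1` in data simulated this way returns a coefficient close to 1 for `t`, which is the true treatment effect stated in the text.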

For each observation $i$, the image whose size and heterogeneity are closest to $x_i$ and $z_i$ is drawn from the total pool of images. This ensures the required association between the factors of variation in the image and the simulated outcome data.

The parametric equations follow the DAG presented in Fig. 1. $u_1$, $u_2$, and $z$ are continuous, independent noise variables. The collider $x$ is the difference between $u_1$ and $u_2$ plus a small amount of Gaussian noise (standard deviation 0.05). $u_1$ and $u_2$ each have a standard deviation of 0.7071 ($\approx \sqrt{2}/2$), so that $x$ has a standard deviation of approximately 1: $\mathrm{Var}(x) = 0.5 + 0.5 + 0.05^2 \approx 1$.

Treatment $t$ is modeled as a Bernoulli variable with a logistic link function, where a higher $u_2$ increases the probability of being treated. A constant of 0.5 is subtracted to ensure that ~50% of patients are treated. Gaussian noise with standard deviation 0.25 is added on the log-odds scale to ensure that every patient has some probability of receiving the more intense treatment. This better reflects clinical practice, as some patients may have strong preferences regarding their treatment, regardless of their underlying health status.

Overall survival ($y$) increases with treatment (the true treatment effect is 1) and decreases with heterogeneity in radiodensity and with tumor aggressiveness. Again, Gaussian noise with standard deviation 0.05 is added to introduce some uncertainty in the data.
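
The image-selection step, picking for each simulated observation the image closest in size and heterogeneity, could look roughly like the sketch below. The source only says "closest"; the Euclidean distance over the pair (x, z), the reuse of images across observations, and the names `match_images`, `x_img`, and `z_img` are all assumptions made for illustration, and the per-image measurements are assumed to be on the same scale as the simulated values.

```python
import numpy as np


def match_images(x_sim, z_sim, x_img, z_img):
    """Return, for each simulated (x_i, z_i), the index of the nearest image.

    Assumptions (not stated in the source): distance is Euclidean in the
    (x, z) plane, and the same image may be selected more than once.
    """
    sim = np.column_stack([x_sim, z_sim])    # shape (n_sim, 2)
    pool = np.column_stack([x_img, z_img])   # shape (n_images, 2)

    # Pairwise squared distances, shape (n_sim, n_images).
    diff = sim[:, None, :] - pool[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)

    # Index of the closest image for every simulated observation.
    return dist2.argmin(axis=1)
```

The pairwise distance matrix grows with the product of the number of observations and the number of images, so for a large pool a spatial index would be a reasonable substitute for the brute-force search shown here.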