Input: Training set X = {(x^(1), t^(1), y^(1)), …, (x^(n), t^(n), y^(n))}, loss function L(·, ·), hyperparameter β > 0, generator network G with initial parameters θ_g, discriminator network D with initial parameters θ_d, number of epochs K
|
1. Estimate the propensity score π(X) with a neural network, then match each treated sample (t = 1) to a control sample (t = 0) by nearest-neighbor matching with replacement |
2. Separate out the unmatched control samples from the matched ones |
3. for epoch = 1, 2, …, K do
|
4. Sample a minibatch of m noise samples from the noise prior
|
5. Sample a minibatch of m unmatched control samples from the data distribution p_data(x_uc) |
6. Update the discriminator D by ascending its stochastic gradient:
|
7. Sample a minibatch of m noise samples from the noise prior
|
8. Update the generator G by descending its stochastic gradient:
|
9. end for
|
10. Discard the discriminator D, draw the generated samples from the generator G, and label them as treated (t = 1) |
11. Train a multitask regression model on the original training set X and use it to predict outcome labels for the generated treated samples |
12. Merge the original dataset with the synthetic data into an augmented balanced dataset |
13. Train a multitask regression model with the augmented dataset |
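
Steps 1 and 2 can be illustrated with a minimal sketch. It assumes the propensity scores are already available as a plain list (in the algorithm they come from the neural network-based estimator, which is not reproduced here) and that matching uses the absolute difference in propensity score as the distance. The function name `match_with_replacement` and the example scores are hypothetical and serve only as illustration.

```python
from typing import List, Sequence, Tuple


def match_with_replacement(
    propensity: Sequence[float], treatment: Sequence[int]
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Match each treated unit to the control with the closest propensity score.

    Returns (pairs, unmatched_controls): pairs holds (treated_idx, control_idx),
    and unmatched_controls lists the control indices never chosen as a match.
    """
    treated = [i for i, t in enumerate(treatment) if t == 1]
    controls = [i for i, t in enumerate(treatment) if t == 0]
    if not controls:
        return [], []

    pairs = []
    for i in treated:
        # With replacement: every treated unit searches the full control pool,
        # so one control may be matched to several treated units.
        j = min(controls, key=lambda c: abs(propensity[c] - propensity[i]))
        pairs.append((i, j))

    # Step 2: controls that were never selected form the unmatched set.
    matched = {j for _, j in pairs}
    unmatched_controls = [c for c in controls if c not in matched]
    return pairs, unmatched_controls


# Hypothetical scores standing in for the output of the propensity network.
scores = [0.80, 0.75, 0.78, 0.30, 0.10, 0.70]
treat = [1, 1, 0, 0, 0, 0]
pairs, unmatched = match_with_replacement(scores, treat)
```

In this example both treated units are matched to the same control (index 2), which is what matching with replacement permits. The remaining controls (indices 3, 4, 5) form the unmatched set that step 5 samples from.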