Author manuscript; available in PMC: 2021 Aug 11.

Published in final edited form as: Comput Methods Programs Biomed Update. 2021 Jul 16;1:100020. doi: 10.1016/j.cmpbup.2021.100020

Algorithm 1.

PSSAM-GAN: Propensity Score Synthetic Augmentation Matching using Generative Adversarial Networks.

Input: Training set X = {(x⁽¹⁾, t⁽¹⁾, y⁽¹⁾),…, (x⁽ⁿ⁾, t⁽ⁿ⁾, y⁽ⁿ⁾)}, loss function L(·, ·), hyperparameter β > 0, generator network G with initial parameters θ_g, discriminator network D with initial parameters θ_d, number of epochs K

1. Estimate the propensity score π(X) with a neural network, then match the treated samples (t = 1) to the control samples (t = 0) using nearest-neighbor matching with replacement

2. Separate out the unmatched control samples from the matched ones

3. for epoch = 1, 2, …, K do

4. Sample minibatch of m noise samples {z^{(i)}}_{i = 1}^{m}, where z ~ N(0, 1)

5. Sample minibatch of m unmatched control samples from the distribution p_data(x_uc)

6. Update the discriminator D by ascending its stochastic gradient:

\nabla_{θ_{d}} (ϕ_{d} (D, G))

7. Sample minibatch of m noise samples {z^{(i)}}_{i = 1}^{m}, where z ~ N(0, 1)

8. Update the generator G by descending its stochastic gradient:

\nabla_{θ_{g}} (ϕ_{g} (D, G))

9. end for

10. Discard the discriminator D, draw the generated samples from the generator G, and label them as treated (t = 1)

11. Train a multitask regression model on the original training set X and use it to predict outcome labels for the generated treated samples

12. Merge the original dataset with the synthetic data into an augmented balanced dataset

13. Train a multitask regression model with the augmented dataset
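
Steps 1 and 2 of the algorithm can be illustrated with a minimal Python sketch. It assumes the propensity scores have already been estimated (the algorithm does this with a neural network, which is not reproduced here) and are passed in as a plain list. The function name `match_with_replacement`, the toy data, and the tie-breaking rule (the first closest control wins) are illustrative assumptions, not part of the published method.

```python
def match_with_replacement(propensity, treatment):
    """Nearest-neighbor matching on propensity scores, with replacement.

    propensity: list of estimated propensity scores, one per sample
                (assumed to be precomputed elsewhere).
    treatment:  list of 0/1 treatment indicators, same length.

    Returns (matches, unmatched_controls):
      matches            maps each treated index to its nearest control index
      unmatched_controls lists control indices no treated sample selected
    """
    treated = [i for i, t in enumerate(treatment) if t == 1]
    controls = [i for i, t in enumerate(treatment) if t == 0]
    if not controls:
        raise ValueError("at least one control sample is required")

    matches = {}
    for i in treated:
        # With replacement: every treated sample searches the full control
        # pool, so one control may be chosen by several treated samples.
        matches[i] = min(controls, key=lambda j: abs(propensity[i] - propensity[j]))

    matched = set(matches.values())
    # Step 2: controls that were never selected are the "unmatched" ones,
    # i.e. the samples the GAN is later trained on.
    unmatched_controls = [j for j in controls if j not in matched]
    return matches, unmatched_controls


# Hypothetical toy data: two treated samples, four controls.
propensity = [0.80, 0.76, 0.78, 0.30, 0.10, 0.72]
treatment = [1, 1, 0, 0, 0, 0]

matches, unmatched = match_with_replacement(propensity, treatment)
print(matches)    # {0: 2, 1: 2}
print(unmatched)  # [3, 4, 5]
```

In this toy example both treated samples select control 2, so controls 3, 4, and 5 remain unmatched. These are the samples that steps 3 to 9 would draw minibatches from when training the generator and discriminator.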