2020 Apr 9;8:298. doi: 10.3389/fbioe.2020.00298

Algorithm 1.

Training Procedure

  1: Training set {Ci}i=1M, number of training steps T, batch size B.
  2: Initialize the neural net params θ.
  3: Initialize the baseline values b(Ci).
  4: for t = 1 to T do
  5:    Select a batch of samples Ci for i ∈ {1, ⋯ , B}.
  6:    Sample solution πi based on pθ(·|Ci) for i ∈ {1, ⋯ , B}.
  7:    Let gθ = (1/B) Σ_{i=1}^{B} [(AC(πi|Ci) − b(Ci)) ∇θ log pθ(πi|Ci)].
  8:    Update θ = ADAM(θ, gθ).
  9:    Update baseline b(Ci) = b(Ci) + α(AC(πi|Ci) − b(Ci)) for i ∈ {1, ⋯ , B}.
10: end for
11: return neural net parameters θ.
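The procedure above is REINFORCE with a per-instance moving-average baseline: sample solutions from the policy, form the advantage AC(πi|Ci) − b(Ci), take a policy-gradient step, and update the baseline toward the observed cost. A minimal sketch of that loop is below, under stated assumptions that are not from the paper: each instance Ci is a vector of K action costs, the policy is a single softmax over shared logits θ (no encoder conditioning on Ci), AC(πi|Ci) is the cost of the chosen action, and plain SGD stands in for ADAM.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumption, not the paper's model): each instance C_i is a
# vector of K action costs; pi_i is a single action sampled from a
# softmax over shared logits theta; AC(pi_i | C_i) = C_i[pi_i].
K, M = 5, 20
instances = [rng.uniform(0.0, 1.0, size=K) for _ in range(M)]

theta = np.zeros(K)
baseline = np.zeros(M)   # b(C_i): one scalar per training instance
alpha = 0.1              # baseline step size (alpha in step 9)
T, B = 800, 8            # training steps and batch size
lr = 0.05                # plain SGD step size stands in for ADAM

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for t in range(T):
    batch = rng.choice(M, size=B, replace=False)   # step 5: select a batch
    g = np.zeros_like(theta)
    for i in batch:
        p = softmax(theta)
        pi = rng.choice(K, p=p)       # step 6: sample pi_i ~ p_theta(.|C_i)
        cost = instances[i][pi]       # AC(pi_i | C_i)
        adv = cost - baseline[i]      # advantage against the baseline
        grad_logp = -p                # gradient of log softmax ...
        grad_logp[pi] += 1.0          # ... at the sampled action
        g += adv * grad_logp / B      # step 7: Monte-Carlo policy gradient
        # step 9: moving-average baseline update
        baseline[i] += alpha * (cost - baseline[i])
    theta -= lr * g                   # step 8: descend (we minimize cost)

# The trained policy should incur a lower expected cost than a
# uniform-random policy, averaged over the training instances.
mean_policy = np.mean([c @ softmax(theta) for c in instances])
mean_uniform = np.mean([c.mean() for c in instances])
```

Because the sketch's policy does not condition on Ci, it can only learn the action that is cheapest on average across instances; the paper's neural net pθ(·|Ci) conditions on each instance, but the gradient estimator and baseline update are the same.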