Algorithm 1: GRAM Optimization

 Randomly initialize basic embedding matrix E, attention parameters ua, Wa, ba, RNN parameter θr, softmax parameters W, b.
repeat
  Update E with GloVe objective function (see Section 2.4)
until convergence
repeat
  X ← random patient from dataset
  for visit Vt in X do
   for code ci in Vt do
    Refer to G to find ci’s ancestors C′
    for code cj in C′ do
     Calculate attention weight αij using Eq. (2).
    end for
    Obtain final representation gi using Eq. (1).
   end for
   vt ← tanh(∑i:ci∈Vt gi)
   Make prediction ŷt using Eq. (4)
  end for
  Calculate prediction loss L using Eq. (5)
  Update parameters according to the gradient of L
until convergence
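
The first repeat-loop pretrains the basic embedding matrix E on code co-occurrence within visits via the GloVe objective (Section 2.4). Below is a minimal NumPy sketch of that step, assuming a precomputed co-occurrence count matrix X over codes and their ancestors; the symmetric single-matrix formulation, the learning rate, and the weighting constants x_max and alpha are illustrative assumptions, not the paper's exact setup.

import numpy as np

def pretrain_embeddings(X, dim=128, epochs=25, lr=0.05, x_max=100.0, alpha=0.75):
    """SGD sketch of the GloVe objective used to initialize E.

    X[i, j] counts how often codes i and j (including ancestors)
    co-occur in the same visit.  Hyperparameters are hypothetical."""
    n = X.shape[0]
    E = 0.01 * np.random.randn(n, dim)   # basic embedding matrix E
    b = np.zeros(n)                      # per-code bias terms
    rows, cols = np.nonzero(X)           # only nonzero co-occurrences contribute
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            w = min((X[i, j] / x_max) ** alpha, 1.0)          # weighting f(X_ij)
            err = E[i] @ E[j] + b[i] + b[j] - np.log(X[i, j])  # residual of log count
            gi = E[i].copy()                                   # freeze before update
            E[i] -= lr * w * err * E[j]
            E[j] -= lr * w * err * gi
            b[i] -= lr * w * err
            b[j] -= lr * w * err
    return E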
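The inner loops over codes implement Eqs. (1)-(2): each code ci is replaced by a convex combination of its own basic embedding and those of its ancestors C′, and a visit vector vt is formed from the resulting gi. A sketch of both steps, assuming a hypothetical precomputed table ancestors[i] that lists C′ for code i (here taken to include i itself, following the paper's formulation of the attention set):

import numpy as np

def code_representation(E, anc_i, i, u_a, W_a, b_a):
    """g_i = sum_j alpha_ij * e_j over j in anc_i (Eq. 1), with alpha_ij a
    softmax over compatibility scores f(e_i, e_j) (Eq. 2)."""
    e_i = E[i]
    # f(e_i, e_j) = u_a^T tanh(W_a [e_i ; e_j] + b_a)
    scores = np.array([u_a @ np.tanh(W_a @ np.concatenate([e_i, E[j]]) + b_a)
                       for j in anc_i])
    alpha = np.exp(scores - scores.max())      # numerically stable softmax
    alpha /= alpha.sum()
    return alpha @ E[anc_i]                    # attention-weighted sum of embeddings

def visit_representation(E, visit, ancestors, u_a, W_a, b_a):
    """v_t = tanh(sum of g_i over codes c_i in V_t)."""
    g = [code_representation(E, ancestors[i], i, u_a, W_a, b_a) for i in visit]
    return np.tanh(np.sum(g, axis=0))

Here W_a maps the concatenated pair [e_i; e_j] to an attention hidden space and u_a projects it to a scalar score, so attention weights always sum to one over each code's ancestor set.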
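Finally, the visit vectors v1..vT feed an RNN whose hidden state drives the per-visit prediction (Eq. 4) and the sequence loss (Eq. 5), followed by a gradient step on all parameters. The sketch below substitutes a plain tanh RNN cell and an elementwise binary cross-entropy as stand-ins; the paper's exact RNN variant and loss form are given by Eqs. (4)-(5), and in practice the final gradient update would be taken with an autodiff framework rather than by hand.

import numpy as np

def predict_and_loss(visits_v, labels, theta_r, W, b):
    """Run a tanh RNN over visit vectors and score each step.

    visits_v : list of v_t vectors; labels : list of multi-hot y_t.
    theta_r  : dict with recurrent weights Wh, Wv, bh (a stand-in
               structure for the paper's RNN parameters theta_r).
    Sigmoid + binary cross-entropy is an assumption in place of
    the exact forms of Eqs. (4)-(5)."""
    Wh, Wv, bh = theta_r["Wh"], theta_r["Wv"], theta_r["bh"]
    h = np.zeros(Wh.shape[0])
    loss, eps = 0.0, 1e-8
    for v_t, y_t in zip(visits_v, labels):
        h = np.tanh(Wh @ h + Wv @ v_t + bh)          # recurrent update
        y_hat = 1.0 / (1.0 + np.exp(-(W @ h + b)))   # per-code probability
        loss -= np.mean(y_t * np.log(y_hat + eps)
                        + (1 - y_t) * np.log(1 - y_hat + eps))
    return loss / len(visits_v)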