
Figure 2.

Illustration of the LSTM model. a) The building block: the LSTM memory cell. The operator ";" denotes vector concatenation, σ(·) and tanh(·) are the element-wise sigmoid and hyperbolic tangent functions, and * is element-wise multiplication. Here f_t, i_t, o_t are the values of the forget, input and output gates, respectively; ĉ_t is the candidate value for the cell state; W_f, W_i, W_c, W_o are weight matrices and b_f, b_i, b_c, b_o are their associated bias vectors. b) The sentence-level LSTM model architecture for relation classification. Each LSTM block corresponds to the memory cell structure in a). Each input s_t, t = 1,…,n, has dimension d_emb (the word embedding size) plus two numbers p_1, p_2 giving the distances of the current word to concept 1 and concept 2, respectively. Each LSTM memory cell output h_t has dimension n_hu; the outputs are then pooled to produce a feature vector, also of dimension n_hu. The pooling output can be regarded as the hidden units, which feed the softmax layer that produces the relation label y for relation classification.
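
For reference, the standard LSTM memory-cell update that panel a) depicts can be written out as below. This is the common formulation in which the gates act on the concatenation [h_{t-1}; s_t]; it matches the caption's notation but is a reconstruction, not a transcription of the figure.

\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1};\, s_t] + b_f\right), &
i_t &= \sigma\left(W_i\,[h_{t-1};\, s_t] + b_i\right), \\
\hat{c}_t &= \tanh\left(W_c\,[h_{t-1};\, s_t] + b_c\right), &
c_t &= f_t * c_{t-1} + i_t * \hat{c}_t, \\
o_t &= \sigma\left(W_o\,[h_{t-1};\, s_t] + b_o\right), &
h_t &= o_t * \tanh(c_t).
\end{aligned}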
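
The pipeline in panel b) (word embedding plus two position distances → LSTM → pooling over time → softmax) can be sketched in a few lines. The sketch below uses PyTorch; the hyperparameter values, class and variable names, and the choice of max pooling over the LSTM outputs are illustrative assumptions, not details taken from the paper.

# Minimal sketch of the sentence-level LSTM relation classifier in panel b).
# d_emb, n_hu, n_labels and max pooling are assumptions for illustration only.
import torch
import torch.nn as nn

class SentenceLSTMRelationClassifier(nn.Module):
    def __init__(self, vocab_size, d_emb=100, n_hu=128, n_labels=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_emb)
        # Each input s_t = [word embedding ; p_1 ; p_2], so the LSTM input size is d_emb + 2.
        self.lstm = nn.LSTM(d_emb + 2, n_hu, batch_first=True)
        self.out = nn.Linear(n_hu, n_labels)

    def forward(self, token_ids, pos1, pos2):
        # token_ids: (batch, n) word indices
        # pos1, pos2: (batch, n) distances of each word to concept 1 and concept 2
        emb = self.embed(token_ids)                            # (batch, n, d_emb)
        positions = torch.stack([pos1, pos2], dim=-1).float()  # (batch, n, 2)
        s = torch.cat([emb, positions], dim=-1)                # (batch, n, d_emb + 2)
        h, _ = self.lstm(s)                                    # (batch, n, n_hu)
        pooled, _ = h.max(dim=1)                               # pool over time -> (batch, n_hu)
        return self.out(pooled)                                # logits; softmax applied in the loss

# Usage with dummy data:
model = SentenceLSTMRelationClassifier(vocab_size=10000)
tokens = torch.randint(0, 10000, (2, 12))
p1 = torch.randint(-12, 12, (2, 12))
p2 = torch.randint(-12, 12, (2, 12))
logits = model(tokens, p1, p2)   # (2, n_labels); argmax gives the relation label y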