2022 Mar 14;12(5):jkac059. doi: 10.1093/g3journal/jkac059

Fig. 2.

Pipeline for training the CNN that classifies sentences containing the word “stress.” Terms specific to “system stress” or “cellular stress” were obtained with the cosine similarity tool in Python’s Gensim library, applied to word2vec embeddings derived from PubMed and PMC text. Abstracts containing these terms were fetched from PubMed. The retrieved text was then tokenized and split into training and validation sets. The input layer of the model passes the training data to the embedding layer. After a 1D convolutional layer, downsampling is performed by a max pooling layer. The output is flattened and connected to 2 fully connected (dense) layers. The rectified linear unit (ReLU) function activates the neurons in the convolutional layer and the first dense layer, and the last dense layer is activated by the sigmoid function. The trained model classifies input sentences as either system stress or cellular stress.
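
The layer sequence described in the caption (embedding, 1D convolution with ReLU, max pooling, flatten, a ReLU dense layer, and a sigmoid output) can be traced with a minimal forward pass in plain Python. This is only a sketch under stated assumptions: every size (vocabulary, embedding dimension, sequence length, filter count, kernel and pool widths, hidden units) is hypothetical, the weights are random stand-ins for trained parameters, and the mapping of the output score to “cellular stress” versus “system stress” is an arbitrary choice made for illustration. None of these values come from the paper.

```python
import math
import random

random.seed(0)

# All sizes below are illustrative assumptions, not values from the paper.
VOCAB_SIZE = 50   # number of distinct token ids
EMBED_DIM = 8     # embedding vector length
SEQ_LEN = 12      # tokens per (padded) sentence
N_FILTERS = 4     # convolution filters
KERNEL = 3        # convolution window, in tokens
POOL = 2          # max-pooling window
HIDDEN = 6        # units in the first dense layer


def rand_matrix(rows, cols):
    """Small random weights standing in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


embedding = rand_matrix(VOCAB_SIZE, EMBED_DIM)
# One weight per (filter, window position, embedding dimension).
conv_w = [rand_matrix(KERNEL, EMBED_DIM) for _ in range(N_FILTERS)]
conv_b = [0.0] * N_FILTERS
CONV_LEN = SEQ_LEN - KERNEL + 1   # "valid" convolution with stride 1
POOLED_LEN = CONV_LEN // POOL
FLAT = POOLED_LEN * N_FILTERS
dense1_w = rand_matrix(HIDDEN, FLAT)
dense1_b = [0.0] * HIDDEN
dense2_w = rand_matrix(1, HIDDEN)
dense2_b = [0.0]


def forward(token_ids):
    """Return a score in (0, 1) for a list of SEQ_LEN token ids."""
    # Embedding layer: token id -> vector.
    x = [embedding[t] for t in token_ids]
    # 1D convolution over token positions, followed by ReLU.
    conv = [
        [
            relu(conv_b[f] + sum(
                conv_w[f][k][d] * x[i + k][d]
                for k in range(KERNEL) for d in range(EMBED_DIM)))
            for f in range(N_FILTERS)
        ]
        for i in range(CONV_LEN)
    ]
    # Max pooling: keep the largest value in each non-overlapping window.
    pooled = [
        [max(conv[p * POOL + j][f] for j in range(POOL)) for f in range(N_FILTERS)]
        for p in range(POOLED_LEN)
    ]
    # Flatten the pooled feature map into a single vector.
    flat = [v for row in pooled for v in row]
    # First dense layer with ReLU.
    hidden = [relu(b + sum(w * v for w, v in zip(ws, flat)))
              for ws, b in zip(dense1_w, dense1_b)]
    # Final dense layer with sigmoid activation.
    return sigmoid(dense2_b[0] + sum(w * h for w, h in zip(dense2_w[0], hidden)))


# A hypothetical tokenized sentence: random ids in place of real tokens.
sentence = [random.randrange(VOCAB_SIZE) for _ in range(SEQ_LEN)]
score = forward(sentence)
# Assumed convention: scores >= 0.5 map to "cellular stress".
label = "cellular stress" if score >= 0.5 else "system stress"
print(f"score={score:.3f} -> {label}")
```

Because the final layer has a single sigmoid unit, the model emits one score per sentence, and thresholding that score yields the binary decision between the two classes. In practice the same stack would normally be built and trained with a deep learning framework rather than written by hand.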