Author manuscript; available in PMC: 2018 May 1.
Published in final edited form as: J Biomed Inform. 2017 Mar 27;69:75–85. doi: 10.1016/j.jbi.2017.03.016

Table 1.

The key concepts in this paper and their definitions

| Concept | Definition |
| --- | --- |
| Target vocabulary | The vocabulary to which we would like to suggest new terms. In this study, our target vocabulary is the CHV. |
| N-gram | A contiguous sequence of n words in a sentence. In this study, we included up to 5-grams, since n-grams with n from 1 to 5 cover over 99% of the terms of interest [33]. |
| Seed term | An n-gram extracted from the text corpus that can be found in the target vocabulary (i.e., the CHV). |
| Candidate term | An n-gram that is not covered by the target vocabulary but could potentially be added to it. To qualify as a candidate term, an n-gram may be subject to some constraints, e.g., occurring more than 5 times in the corpus. |
| Term context | Either the entire sentence that contains the term, or a window of 10 words before or after the term in its sentence. |
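
The definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`ngrams`, `classify_terms`, `context_window`), the use of pre-tokenized sentences, and the reading of the context window as up to 10 words on each side of the term are all assumptions made for illustration. The target vocabulary is represented as a plain set of strings standing in for the CHV.

```python
from collections import Counter


def ngrams(tokens, max_n=5):
    """Yield (start, n, text) for every contiguous n-gram with 1 <= n <= max_n."""
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            yield start, n, " ".join(tokens[start:start + n])


def classify_terms(sentences, target_vocab, min_count=5, max_n=5):
    """Split corpus n-grams into seed terms and candidate terms.

    sentences: list of token lists (tokenization is assumed to be done upstream).
    target_vocab: set of terms standing in for the target vocabulary.
    A candidate must occur more than min_count times (example constraint from Table 1).
    """
    counts = Counter(
        text for tokens in sentences for _, _, text in ngrams(tokens, max_n)
    )
    seeds = {term for term in counts if term in target_vocab}
    candidates = {
        term
        for term, count in counts.items()
        if term not in target_vocab and count > min_count
    }
    return seeds, candidates


def context_window(tokens, start, n, window=10):
    """Return up to `window` tokens to the left and right of the n-gram at `start`."""
    left = tokens[max(0, start - window):start]
    right = tokens[start + n:start + n + window]
    return left, right
```

For example, with a toy corpus in which the sentence `my blood sugar is high` appears six times and the target vocabulary contains only `blood sugar`, `classify_terms` would treat `blood sugar` as a seed term and n-grams such as `sugar is high` as candidate terms, since they occur more than 5 times and are absent from the vocabulary.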