AMIA Annual Symposium Proceedings. 2013 Nov 16;2013:1512–1521.

A Literature-Based Assessment of Concept Pairs as a Measure of Semantic Relatedness

T Elizabeth Workman 1, Graciela Rosemblat 1, Marcelo Fiszman 1, Thomas C Rindflesch 1
PMCID: PMC3900161  PMID: 24551423

Abstract

The semantic relatedness between two concepts, according to human perception, is domain-rooted and reflects prior knowledge. We developed a new method for semantic relatedness assessment that reflects human judgment, utilizing semantic predications extracted from PubMed citations by SemRep. We compared the new method to other approaches utilizing path-based, statistical, and context vector methods, using a gold standard for evaluation. The new method outperformed all others, except one variation of the context vector technique. These findings have implications in several natural language processing applications, such as serendipitous knowledge discovery.

Introduction

The notions of semantic relatedness and similarity entail an association between two concepts. This association could be syntactic or lexical, taxonomic (i.e., IS-A), or dependent on sharing a given number of features. As described in the literature by Pedersen et al. (1) and Resnik (2), semantic relatedness is a more general concept than semantic similarity. Pedersen asserts that semantic similarity addresses the “likeness” between concepts, while semantic relatedness, further qualified as a function of human judgment, corresponds to a more general sense of the association between concepts. Similarly, Thompson-Schill et al. described associative relatedness between two concepts as the probability that one would remind a person of the other (3). This probability, according to these authors, more strongly correlates with word use than with word meaning.

Humans can assess the degrees of relatedness between concept pairs on the basis of the prior knowledge and expertise they bring to the judgment activity. This is demonstrated through the variations in judgment exhibited by different groups for the relatedness of concept pairs. For example, in Pedersen’s research, there were significant differences in the relatedness values that physicians and medical coders assigned to concept pairs. The accuracy of a relatedness measurement in a given domain can be assessed according to how closely its values correlate with human judgment derived from individuals within that domain.

In this paper we introduce a new method for assessing semantic relatedness between concepts. We sought an appraisal technique for relatedness that is fundamentally centered on human judgment, and we exploit semantic predications produced by SemRep (4) as having that characteristic. Semantic predications capture relationships between specific concepts according to the published biomedical research record, and thus provide a general assessment of this subject according to human perception. A predication joins two independent concepts with a binding relation that expresses the semantic nature of their relationship; predications are thus relational by nature. We hypothesized that a simple snapshot of semantic predication frequencies would provide a reliable measurement of semantic relatedness between individual concepts. Here we explore how this new method may provide insight into clinicians’ perceptions of semantic relatedness.

Semantic relatedness assessment has multiple practical applications in natural language processing (NLP). It is a common function in information retrieval (5). It has also been applied to question answering (6), clinical decision support (7), business intelligence (8), and data mining (9), among other uses. Semantic relatedness assessment also has potential application in facilitating serendipitous knowledge discovery (10).

Background

SemRep

Using the UMLS (11) knowledge sources designed for the biomedical domain, SemRep (4), an NLP application, captures concepts connected by a relation or predicate in PubMed data, returning semantic predications that represent propositions asserted in the text. Predications consist of three pieces of information expressed in propositional subject_verb_object form, in which the subject and object arguments correspond to UMLS Metathesaurus concepts and are joined by a licensed relation from the UMLS Semantic Network. For example, taking the following text as input:

“…The current standard treatment for advanced hepatocellular carcinoma (HCC) is sorafenib.”

(12)

SemRep produces the following semantic predication:

Sorafenib TREATS Primary carcinoma of the liver cells

Through MetaMap (13), SemRep identifies “sorafenib” and “hepatocellular carcinoma” as primary arguments in the text and maps them to the preferred terms sorafenib and Primary carcinoma of the liver cells, respectively, in the UMLS Metathesaurus. SemRep identifies the term “treatment” in the text as the relationship between these arguments and maps it to the predicate TREATS in the UMLS Semantic Network. It is interesting to note that Semantic Space Models, as defined by Padó and Lapata (14), leverage various data from a given corpus to identify semantic relationships. SemRep likewise uses essential information implicit in PubMed data to identify semantic relationships and articulate rich assertions from PubMed text.
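To make the structure of this output concrete, the following minimal Python sketch represents the triple above as a small data record. The Predication class, its field names, and the placeholder identifiers are ours for illustration only; they do not reflect SemRep’s actual output format.

```python
# A minimal sketch (not SemRep's actual output format) of a semantic predication:
# a subject concept, a binding predicate, and an object concept.
from dataclasses import dataclass

@dataclass(frozen=True)
class Predication:
    subject_cui: str    # UMLS Metathesaurus concept identifier (subject argument)
    subject_name: str   # UMLS preferred term for the subject
    predicate: str      # relation licensed by the UMLS Semantic Network
    object_cui: str     # UMLS Metathesaurus concept identifier (object argument)
    object_name: str    # UMLS preferred term for the object

# The example from the text; the CUI strings are placeholders, not real CUIs.
example = Predication(
    subject_cui="CUI_SORAFENIB",
    subject_name="sorafenib",
    predicate="TREATS",
    object_cui="CUI_HCC",
    object_name="Primary carcinoma of the liver cells",
)
print(f"{example.subject_name} {example.predicate} {example.object_name}")
```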

Relational Concept Assessment

Semantic relatedness can also be viewed as a function of how concepts are connected by relationships in text. By nature, this method reflects human assessment of semantic relatedness within the biomedical domain, because it expresses knowledge recorded in published literature. In Methods, we explain how semantic predication frequencies can be used for semantic relatedness assessment.

Literature Review

There are many methods for assessing the semantic similarity and relatedness between two concepts. Here we highlight three popular techniques. Path-based methods measure how concepts are connected in hierarchical structures such as ontologies, and these values may be augmented with some statistical assessment. Other approaches use purely statistical techniques. Context vectors have also been tested in research that is particularly relevant to issues of human judgment of semantic relatedness in the biomedical domain.

Path-based

Path-based methods measure distances in hierarchical concept structures that connect concepts through different types of relationships, such as “IS-A,” sibling-type, and others. The root contains general concepts, and the hierarchy organizes ever-narrower related concepts into a tree-like structure, until the leaves of the assembly consist of the most specific available terms. In such a hierarchy, edges connect concept nodes, and path-based techniques assess the edge network that connects two concepts. The fundamental method is to count the edges to find the shortest path connecting two concepts in a hierarchy. Utilizing MeSH as a semantic network, Rada et al. (15) measured similarity between two concepts with a technique that averages minimum path lengths over pairwise combinations of subsets of nodes. Leacock and Chodorow (16) scaled shortest path values in WordNet (17) by twice the hierarchy’s maximum depth and calculated the logarithm of the result. Choi and Kim (18) used a hierarchy concept tree to calculate a weighted concept distance that was then multiplied by concept level values and normalized. McInnes et al. (19) used the reciprocal of the node count of the shortest path to calculate the semantic similarity between UMLS Metathesaurus concepts. To facilitate their task and overcome the lack of uniformity in programming platforms and standards used in the field, McInnes et al. developed two open source frameworks that extract path and concept information from the UMLS: the UMLS-Similarity and UMLS-Interface packages. Nguyen and Al-Mubaid (20) used calculations based on the most specific ancestral term shared by two concepts, called the least common subsumer (LCS): they calculated the logarithm of two plus the product of (the shortest distance between the two concepts minus one) and (the taxonomy’s depth minus the depth of the concepts’ LCS). Wu and Palmer (21) scaled the path from an LCS to the root by the sum of the distances between the LCS and the concepts being measured. Resnik (2) calculated the information content of a concept as the negative log of the ratio of the concept’s frequency to the hierarchy root’s frequency in a corpus; to determine the similarity of two given concepts, he applied this information content formula to their LCS. Jiang and Conrath (22) and Lin (23) adjusted the information content of the LCS with values of the individual concepts: Jiang and Conrath applied the combined information content of the individual concepts minus twice that of their LCS, while Lin divided the shared (LCS) information content by a value representing the individual concepts’ information content.
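As an illustration of how several of these measures are computed, the following minimal Python sketch implements the path, Leacock–Chodorow, Resnik, Lin, and Jiang–Conrath formulas, assuming the caller supplies precomputed shortest path lengths, the taxonomy’s maximum depth, and corpus-derived information content (IC) values. It is a sketch of the formulas as described above, not a re-implementation of any cited system; the numeric values in the usage lines are invented.

```python
# Sketches of common path-based and information-content similarity measures.
import math

def path_similarity(shortest_path_nodes: int) -> float:
    # Reciprocal of the node count on the shortest path (cf. McInnes et al.).
    return 1.0 / shortest_path_nodes

def leacock_chodorow(shortest_path: int, max_depth: int) -> float:
    # Shortest path scaled by twice the hierarchy's maximum depth,
    # then the negative log of that ratio (standard formulation).
    return -math.log(shortest_path / (2.0 * max_depth))

def information_content(concept_freq: float, root_freq: float) -> float:
    # Resnik's IC: negative log of the concept's frequency relative to the root's.
    return -math.log(concept_freq / root_freq)

def resnik(ic_lcs: float) -> float:
    # Similarity is the information content of the least common subsumer (LCS).
    return ic_lcs

def lin(ic_c1: float, ic_c2: float, ic_lcs: float) -> float:
    # Shared (LCS) information content scaled by the concepts' own IC.
    return 2.0 * ic_lcs / (ic_c1 + ic_c2)

def jiang_conrath_distance(ic_c1: float, ic_c2: float, ic_lcs: float) -> float:
    # Combined IC of the concepts minus twice that of their LCS (a distance).
    return ic_c1 + ic_c2 - 2.0 * ic_lcs

# Toy usage with invented values: a 3-edge path, maximum depth 10, sample IC values.
print(leacock_chodorow(shortest_path=3, max_depth=10))
print(lin(ic_c1=5.2, ic_c2=4.8, ic_lcs=3.1))
```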

Pedersen et al., McInnes et al., and Nguyen and Al-Mubaid applied several of these methods to determine how closely they paralleled actual human judgment in the biomedical domain. Pedersen asked medical coders and physicians to rate the semantic similarity of 30 concept pairs on a scale from 1 to 4, where 1 equaled unrelated, 2 marginally related, 3 related, and 4 practical synonymy. These concept pairs consisted primarily of disease/disease combinations. Pedersen applied five of the path-based methods (as well as context vectors) to determine the semantic similarity of 29 of the concept pairs and then compared the results to the physicians’ and medical coders’ ratings (one pair was excluded because one of its concepts was not found in SNOMED CT, a crucial component of their research). Nguyen and Al-Mubaid applied five path-based methods to 25 of the concept pairs (the remaining five did not exist in MeSH 2006, which they used in their research) and also compared results to the physicians’ and medical coders’ ratings. McInnes et al. validated their open-source frameworks by reproducing Pedersen’s results and compared their results to those of Nguyen and Al-Mubaid.

Statistical

Statistical methods evaluate quantitative term properties to assess semantic relatedness. These methods often examine individual term frequencies and term adjacency in corpora. A frequent goal of this research is to calculate some probability of semantic similarity between two terms. Much of this work centers on metrics that measure some notion of distance, either using classical distance measurements such as the L1 Norm or applying metrics stemming from information theory such as the Kullback-Leibler Divergence and Information Radius. Dagan et al. (24) compared the Maximum Likelihood Estimate (MLE) to four similarity measurements, the Kullback-Leibler Divergence, Information Radius (i.e., Total Divergence to the Average), L1 Norm, and Confusion Probability, in identifying words that were semantically similar to others based on common contexts. The similarity measurements consistently outperformed MLE, with Information Radius outperforming all others. Terra and Clarke (25) compared the effectiveness of the Chi-Squared Test, Pointwise Mutual Information, Likelihood Ratio, Average Mutual Information, Information Radius, L1 Norm, and Cosine Pointwise Mutual Information in experiments seeking the best synonym for a target word, with Pointwise Mutual Information achieving the best overall performance. Cha (26) offers a multidisciplinary review of many statistical methods for measuring similarity and distance, including L1 Norm distance, inner product metrics such as basic cosine and the Jaccard and Dice coefficients, several measurements stemming from information theory, and several chi-squared measures.

The Information Radius (Equation 1 below), L1 Norm Distance (Equation 2 below), and Pointwise Mutual Information (Equation 3 below) are of particular interest because of their high reported performance and, in the L1 Norm’s case, how commonly it is used; we also applied them in our research. Information Radius (IRad) utilizes averaged probability measurements from the Kullback-Leibler Divergence (KLD):

KLD(P \,\|\, Q) = \sum_{v} p(v) \log_2 \left( p(v)/q(v) \right)
AVG = \tfrac{1}{2} \left( P(w'|c_1) + P(w'|c_2) \right)
IRad(c_1, c_2) = KLD\left( P(w'|c_1) \,\|\, AVG \right) + KLD\left( P(w'|c_2) \,\|\, AVG \right)    (1)

Here, c_1 and c_2 reference the two concepts being compared, and w' is a contextual word found within a predetermined window (e.g., within the same line of text, or within the same paragraph). Kullback and Leibler devised their metric to determine the degree to which one distribution (P) deviates from another (Q). Information Radius provides a more balanced view of how the two distributions differ. Applying this metric should therefore provide insight into how two potentially similar terms are distributed differently in a corpus. Dagan’s application utilized contextual terms. Averaging the measurements avoids otherwise difficult situations in which the contextual word does not occur with one of the concepts, yielding a zero value, which is problematic for KLD.

L1 Norm Distance (L) measures the cumulative difference between values associated with two concepts. Smaller scores imply greater concept relatedness:

L(c_1, c_2) = \sum_{w'} \left| P(w'|c_1) - P(w'|c_2) \right|    (2)

As with Information Radius, w' references a contextual word found within a predetermined window. For example, P(w'|c_i) (for both IRad and L) could refer to the number of windows in which the contextual word w' occurred with the concept, relative to the total number of windows in the corpus.
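As a concrete illustration of Equations 1 and 2, the following minimal Python sketch computes IRad and the L1 Norm from two such contextual-word distributions. The toy probabilities are invented for illustration only.

```python
# Sketch of Information Radius (Equation 1) and L1 Norm Distance (Equation 2),
# given dictionaries mapping contextual words w' to P(w'|c) for each concept.
import math

def kld(p: dict, q: dict) -> float:
    # Kullback-Leibler Divergence KLD(P || Q), summed over words with p(w') > 0.
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items() if pw > 0)

def information_radius(p1: dict, p2: dict) -> float:
    # IRad(c1, c2): KLD of each distribution against their average.
    words = set(p1) | set(p2)
    avg = {w: 0.5 * (p1.get(w, 0.0) + p2.get(w, 0.0)) for w in words}
    return kld(p1, avg) + kld(p2, avg)

def l1_norm(p1: dict, p2: dict) -> float:
    # L(c1, c2): accumulated absolute difference in probabilities.
    words = set(p1) | set(p2)
    return sum(abs(p1.get(w, 0.0) - p2.get(w, 0.0)) for w in words)

# Toy contextual-word distributions (invented values).
p_c1 = {"edema": 0.5, "failure": 0.3, "diuretic": 0.2}
p_c2 = {"edema": 0.4, "failure": 0.1, "dyspnea": 0.5}
print(information_radius(p_c1, p_c2), l1_norm(p_c1, p_c2))
```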

Pointwise Mutual Information (PMI), on the other hand, uses measurements of each of the two concepts being appraised, without considering any common contextual words. It assumes that each concept occurs independently; positive values therefore indicate the degree to which the concepts co-occur more than would be expected under that independence assumption:

PMI(c_1, c_2) = \log_2 \left( \frac{p(c_1, c_2)}{p(c_1)\, p(c_2)} \right)    (3)

For example, the term p(c_1, c_2) (the numerator) can refer to the number of windows in which both concepts in question occurred, relative to the total number of windows in the corpus. For instance, if the two terms appeared together in 11 windows within a corpus containing 700 windows, p(c_1, c_2) = 11/700 = 0.0157. The term p(c_i) can refer to the relative frequency of an individual concept in the corpus. For example, if concept c_1 occurred 29 times within a corpus of 1200 words, p(c_1) = 29/1200 = 0.0242. To find the denominator, one multiplies these values for the two concepts.
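The following minimal Python sketch applies Equation 3 to the worked numbers above; the count for the second concept is invented here purely to complete the example.

```python
# Sketch of Pointwise Mutual Information (Equation 3).
import math

def pmi(p_joint: float, p_c1: float, p_c2: float) -> float:
    # PMI(c1, c2) = log2( p(c1, c2) / (p(c1) * p(c2)) )
    return math.log2(p_joint / (p_c1 * p_c2))

p_joint = 11 / 700   # both concepts co-occur in 11 of 700 windows (from the text)
p_c1 = 29 / 1200     # concept c1: 29 occurrences in a 1200-word corpus (from the text)
p_c2 = 40 / 1200     # hypothetical count for c2, not taken from the text
print(round(pmi(p_joint, p_c1, p_c2), 3))
```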

Context Vectors

A variation of Latent Semantic Indexing (27, 28), context vectors are used to analyze word co-occurrence in high dimensional space. Several researchers have applied them to semantic similarity and relatedness (29–31). Using the Mayo Clinic Corpus of Clinical Notes, Patwardhan et al. (32) implemented an adaptation of Schütze’s work (33) to create word vectors capturing term occurrences within a contextual window. Using preset thresholds, they then created context vectors of SNOMED CT concepts corresponding to terms within the clinical notes corpus, and also extracted and applied descriptor terms from the Mayo Clinical Thesaurus in this step. The semantic similarity of two concepts was determined by measuring the cosine of the angle between their context vectors. In Pedersen’s work with the 29 pairs, he cites a context vector method limited to Impression/Report/Plan clinical note data as achieving the best results. Clinical notes document encounters between clinicians and patients; the Impression/Report/Plan section includes information on diagnoses and current treatment.

Methods

For the Relational Concept Assessment method, we calculated the frequencies of semantic predications containing each of the 29 concept pairs Pedersen used, for all relationships binding the concept pairs. We extracted the semantic predications from a comprehensive database of SemRep output for all PubMed citations published from 1900 to November 2012 (34), which contains nearly 60 million predications. We then counted the semantic predications containing each concept pair. For example, for the pair “Congestive heart failure” and “Pulmonary edema,” we extracted semantic predications containing these two concepts, for all relationships binding them. Here are some example semantic predications for this concept pair:

  • Congestive heart failure_CAUSES_Pulmonary Edema

  • Pulmonary Edema_COMPLICATES_Congestive heart failure

  • Pulmonary Edema_COEXISTS_WITH_Congestive heart failure

  • Congestive heart failure_COEXISTS_WITH_Pulmonary Edema

In the extraction phase we used the UMLS Metathesaurus Concept Unique Identifiers (CUIs) that McInnes had used in her research. CUIs are codes assigned to UMLS preferred terms, which carry precise definitions of their given concepts.
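As a sketch of this counting step, assuming the predication database has been exported as (subject CUI, predicate, object CUI) tuples, the following Python fragment counts the predications whose arguments match a given concept pair in either order. The CUI strings are placeholders, not the identifiers used in the study.

```python
# Sketch of the Relational Concept Assessment count over predication triples.
from typing import Iterable, Tuple

PredTriple = Tuple[str, str, str]  # (subject CUI, predicate, object CUI)

def relational_concept_count(predications: Iterable[PredTriple],
                             cui_a: str, cui_b: str) -> int:
    # Count predications whose two arguments are exactly the concept pair,
    # in either subject/object order and under any binding predicate.
    return sum(1 for subj, _pred, obj in predications
               if {subj, obj} == {cui_a, cui_b})

# Toy data echoing the Congestive heart failure / Pulmonary edema example.
toy_db = [
    ("CUI_CHF", "CAUSES", "CUI_PE"),
    ("CUI_PE", "COMPLICATES", "CUI_CHF"),
    ("CUI_PE", "COEXISTS_WITH", "CUI_CHF"),
    ("CUI_CHF", "COEXISTS_WITH", "CUI_PE"),
    ("CUI_CHF", "COEXISTS_WITH", "CUI_HTN"),  # different pair, not counted
]
print(relational_concept_count(toy_db, "CUI_CHF", "CUI_PE"))  # -> 4
```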

According to our hypothesis, the predication frequencies were indicative of a general level of semantic relatedness for their given concept pairs, within the biomedical domain. Therefore, comparing these frequencies of concept co-occurrence to a list of the same concept pairs ranked for semantic relatedness should hypothetically result in substantial correlation.

We also used the three statistical methods to assess semantic relatedness, with semantic predication data as input. For Information Radius and L1 Norm Distance we used the relative frequencies of the predicates bound to each concept as input, because this provided contextual information. To calculate Pointwise Mutual Information, we used, for the numerator, the count of semantic predications containing both concepts of a pair divided by the total number of predications in the database, and, for the concept values in the denominator, each concept’s count relative to the total count of all concepts in the database.
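The following minimal Python sketch reflects our reading of this setup: it builds, for each concept, the relative frequency distribution of the predicates bound to that concept, which could then be supplied to the IRad and L1 Norm functions sketched earlier in place of contextual-word probabilities.

```python
# Sketch: predicate relative frequencies as the "contextual" distribution
# for a concept, built from (subject CUI, predicate, object CUI) tuples.
from collections import Counter
from typing import Dict, Iterable, Tuple

def predicate_distribution(predications: Iterable[Tuple[str, str, str]],
                           cui: str) -> Dict[str, float]:
    # Relative frequency of each predicate bound to the given concept,
    # whether the concept appears as subject or object.
    counts = Counter(pred for subj, pred, obj in predications
                     if cui in (subj, obj))
    total = sum(counts.values())
    return {pred: n / total for pred, n in counts.items()} if total else {}

# These distributions would then be passed to information_radius() and
# l1_norm() from the earlier sketch in place of P(w'|c).
```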

Evaluation

We compared the correlation of the output of Relational Concept Assessment, the three statistical methods, the path-based methods, and the context vector methods with the physicians’ ratings of the 29 concept pairs. We used Spearman’s Rank Correlation Coefficient to measure the correlation for Relational Concept Assessment and the three statistical methods, to facilitate comparison with other research. Pedersen, and Nguyen and Al-Mubaid, also used Spearman’s Rank Correlation Coefficient to calculate correlation scores for the path-based and context vector methods they tested. For comparative purposes, we cite these scores (as reported in Pedersen’s and in Nguyen and Al-Mubaid’s research) in the Results section.
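As a sketch of this evaluation step, the following Python fragment computes Spearman’s Rank Correlation Coefficient between a method’s scores and the physicians’ ratings using SciPy. Both lists below are invented placeholders, not the study’s data.

```python
# Sketch of the evaluation: Spearman rank correlation between method output
# and physician ratings for the same concept pairs, in the same order.
from scipy.stats import spearmanr

physician_ratings = [4.0, 3.3, 3.0, 2.7, 1.7, 1.0]  # hypothetical ratings
method_scores     = [59, 108, 5, 47, 16, 0]          # hypothetical method output

rho, p_value = spearmanr(physician_ratings, method_scores)
print(f"Spearman rho = {rho:.3f} (p = {p_value:.3f})")
```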

Results

Relational Concept Assessment achieved better correlation with the physicians’ concept pair ratings than all but one of the other methods tested, outscoring those methods by 0.022 to 0.642 points. The top-performing method, context vectors limited to Impression/Report/Plan clinical data, exceeded the Relational Concept Assessment value by 0.174 points.

Table 1 indicates the concept pairs and physicians’ relatedness scores, semantic predication frequencies, and statistical method scores. Data highlighted in gray were not used in Nguyen and Al-Mubaid’s research.

Table 1.

Concept pairs, physician ratings, semantic predication frequencies in database, Pointwise Mutual Information (PMI), Information Radius (IRad), and L1 Norm values. Items in gray indicate data not used in Nguyen and Al-Mubaid’s research.

Term 1 | Term 2 | Physicians’ Ratings | Semantic Predication Frequency | PMI | IRad | L1 Norm
Renal failure | Kidney failure | 4 | 59 | 4.629 | 0 | 0
Heart | Myocardium | 3.3 | 108 | 4.040 | 0.020 | 0.177
Abortion | Miscarriage | 3 | 5 | 6.521 | 0.163 | 0.410
Congestive heart failure | Pulmonary edema | 3 | 28 | 4.283 | 0.198 | 0.621
Stroke | Infarct | 3 | 210 | 3.976 | 0.301 | 0.683
Delusion | Schizophrenia | 3 | 141 | 7.186 | 0.255 | 0.695
Metastasis | Adenocarcinoma | 2.7 | 47 | 1.222 | 0.175 | 0.634
Calcification | Stenosis | 2.7 | 0 | 0 | 0.263 | 0.582
Diarrhea | Stomach cramps | 2.3 | 0 | 0 | 0.196 | 0.592
Mitral stenosis | Atrial fibrillation | 2.3 | 65 | 5.482 | 0.056 | 0.187
Brain tumor | Intracranial hemorrhage | 2 | 8 | 4.705 | 0.162 | 0.395
Rheumatoid arthritis | Lupus | 2 | 1 | −0.421 | 0.086 | 0.422
Diabetes mellitus | Hypertension | 2 | 304 | 3.148 | 0.038 | 0.269
Carpal tunnel syndrome | Osteoarthritis | 2 | 2 | 0.449 | 0.196 | 0.531
Acne | Syringe | 2 | 0 | 0 | 0.106 | 0.303
Pulmonary fibrosis | Lung cancer | 1.7 | 1 | 3.363 | 0.152 | 0.375
Pulmonary embolus | Myocardial infarction | 1.7 | 16 | 1.266 | 0.104 | 0.313
Antibiotic | Allergy | 1.7 | 28 | 1.284 | 0.336 | 0.659
Cortisone | Total knee replacement | 1.7 | 0 | 0 | 0.411 | 0.640
Lymphoid hyperplasia | Laryngeal cancer | 1.3 | 0 | 0 | 0.718 | 1.185
Cholangiocarcinoma | Colonoscopy | 1.3 | 0 | 0 | 0.383 | 0.579
Depression | Cellulitis | 1 | 0 | 0 | 0.318 | 0.645
Multiple sclerosis | Psychosis | 1 | 17 | 1.406 | 0.068 | 0.225
Xerostomia | Alcoholic cirrhosis | 1 | 0 | 0 | 0.140 | 0.464
Peptic ulcer disease | Myopia | 1 | 0 | 0 | 0.031 | 0.198
Appendicitis | Osteoporosis | 1 | 0 | 0 | 0.112 | 0.372
Hyperlipidemia | Metastasis | 1 | 0 | 0 | 0.205 | 0.538
Varicose vein | Entire knee meniscus | 1 | 0 | 0 | 0.382 | 0.547
Rectal polyp | Aorta | 1 | 0 | 0 | 0.440 | 0.809

Table 2 indicates the correlation of the Relational Concept Assessment method, the three statistical methods, the context vector methods, and Pedersen’s application of the path-based methods for the 29 concept pairs included in his study. Table 3 reports the correlation of the semantic predication frequencies, the three statistical methods, and Nguyen and Al-Mubaid’s application of path-based methods for the 25 concept pairs from their study.

Table 2.

Spearman’s Rank Correlation Coefficient, comparison to Pedersen’s findings

Method Correlation with 29 concept pairs
Relational
  Relational Concept Assessment 0.666
Statistical
  Pointwise Mutual Information 0.633
  Information Radius −0.244
  L1 Norm −0.086
Context Vectors
  All Sections 0.62
  Impression/Report/Plan 0.84
Path-based (Pedersen)
  Lin 0.60
  Jiang and Conrath 0.45
  Resnik 0.45
  Path 0.36
  Leacock and Chodorow 0.35

Table 3.

Spearman’s Rank Correlation Coefficient, comparison to Nguyen and Al-Mubaid’s findings

Method Correlation with 25 concept pairs
Relational
  Relational Concept Assessment 0.694
Statistical
  Pointwise Mutual Information 0.662
  Information Radius −0.165
  L1 Norm −0.052
Path-based (Nguyen/Al-Mubaid)
  Path 0.627
  Leacock and Chodorow 0.672
  Choi and Kim 0.560
  Wu and Palmer 0.652
  Nguyen and Al-Mubaid 0.666

The Relational Concept Assessment method achieved the highest correlation for the 25 concept pairs (Nguyen and Al-Mubaid) and the second highest correlation for the 29 concept pairs (Pedersen). The context vector method utilizing only the Impression/Report/Plan data in Pedersen’s research achieved the highest overall correlation. Of the three statistical methods, Pointwise Mutual Information substantially outperformed Information Radius and the L1 Norm on each concept pair set. The path-based outcomes were mixed: Lin’s method achieved the highest score in Pedersen’s study of the 29 pairs, while Leacock and Chodorow’s method modestly outperformed the others in Nguyen and Al-Mubaid’s research with the 25 pairs. The Relational Concept Assessment method, however, achieved values in the 0.66 to 0.70 range for both concept pair groups. This new method provides an easily computable semantic relatedness appraisal and could contribute to various NLP applications.

Discussion

The Relational Concept Assessment method’s performance exceeded that of almost all other methods. Only the context vector method, when limited to Impression/Report/Plan clinical note data, exceeded its performance. This strongly suggests that concept pair frequencies found in SemRep’s semantic predications indicate a general semantic relatedness between concepts according to human judgment within the biomedical domain. This new method provides an easily computable semantic relatedness appraisal.

Pointwise Mutual Information produced values somewhat similar to those of the new method. This is not surprising, since it too depends on concept co-occurrence, a property also fundamental to Relational Concept Assessment; each method leverages concept co-occurrence to derive a value. The Information Radius and L1 Norm methods, which used predicate frequencies as input values, achieved substantially lower scores; their correlations were negative because these distance values tended to increase as the physicians’ ratings decreased.

The individual semantic predication frequencies in Table 1 give insight into the new method’s performance and reveal an overall trend matching the physicians’ ratings. All concept pairs receiving a rating of three or higher have associated semantic predication counts. This group has a total of 551 semantic predications associated with six concept pairs, an average of 92 predications per concept pair. Among the nine concept pairs within the 2.0 – 2.9 (marginally related) range there are 427 related semantic predications, for an average of 47 semantic predications per concept pair. However, 304 of these contain the concept pair “Diabetes mellitus/Hypertension,” and the variety of semantic predications reflects the association between these two common comorbidities. Among the 304 predications, 241 bound the two concepts with the predicate COEXISTS_WITH (35). Other binding predicates, including COMPLICATES (36), AFFECTS (37), OCCURS_IN (38), CAUSES (39), and PREDISPOSES (40), further illuminate potential relationships between these disease concepts. This type of specific information could provide additional insight to clinicians pondering complex cases. For the 1.1 – 1.9 ratings range, there were 45 semantic predications associated with the six concept pairs, an average of 7.5 predications per concept pair. This range indicates that the physicians judged these pairs to fall between unrelated and marginally related; it includes semantic predications such as Pulmonary Embolism_COEXISTS_WITH_Myocardial Infarction (41). The physician annotators ranked eight concept pairs as totally unrelated (i.e., a score of 1). However, there were 17 semantic predications in the database containing the concept pair “Multiple sclerosis/Psychosis”; these were extracted from text confirming the relationship (42, 43). The three physicians in Pedersen’s research had rheumatology backgrounds and apparently were not aware of this disease/disease relationship. Overall, the combined semantic predication frequencies match what would be expected across the 1 – 4 range of the physicians’ ratings, and in application may provide clinicians with additional disease insight.

The Relational Concept Assessment method holds potential value for appraising concept relatedness in NLP tasks. As noted earlier, semantic relatedness is relevant to several NLP applications, such as question answering, clinical decision support, and information retrieval. Serendipitous knowledge discovery, the fortuitous discovery of useful knowledge, may also benefit from efficient semantic relatedness assessment. Research addressing this information-seeking behavior has noted that information seekers who experience serendipitous knowledge discovery are often pursuing information that is similar to the new knowledge they coincidentally encounter (44). Therefore, aligning semantically related concepts in a knowledge discovery application may facilitate serendipitous knowledge discovery for users. A future literature-based discovery application that also facilitates this phenomenon may assist clinicians in forming new hypotheses, identifying new disease process mechanisms (45), and finding new applications for existing drugs (46). Relational Concept Assessment also requires less computational overhead than any of the other methods: it merely requires determining semantic predication frequencies, which the database of semantic predications facilitates, whereas the other methods necessitate multiple operations to assess semantic relatedness.

Future work

The context vector method exclusively using Impression/Report/Plan clinical note data achieved the highest correlation with the physician rankings. Future exploration of the Relational Concept Assessment method may yield improved correlation if input is limited to specific PubMed data. For example, limiting input to the Results sections of structured abstracts may produce output more focused on the given topic. Future evaluation with a larger dataset would also further illuminate the new method’s effectiveness in semantic relatedness assessment (see the Limits section below).

Conclusion

In this study we devised a new method of assessing semantic relatedness between concepts. The new method, Relational Concept Assessment, calculates the frequencies of semantic predications containing both terms in a concept pair. We calculated frequencies for 29 concept pairs developed in prior research by Pedersen et al., drawing from a database of nearly 60 million semantic predications. We compared the new method’s performance to path-based, statistical, and context vector methods, comparing the correlation of each method’s outcome to that of physician ratings of Pedersen’s concept pairs. Relational Concept Assessment outperformed all but one of the other methods (a context vector method using specialized input). The new method could potentially contribute to various NLP tasks that rely on semantic relatedness assessment.

Limits

Results of the statistical tests were limited by the type of contextual data available in semantic predications. Because Dagan et al., and Terra and Clarke, used Information Radius and the L1 Norm only with contextual information, we likewise supplied contextual information by using the predicate relative frequencies for each concept. Pedersen’s concept pairs focused almost entirely on disease/disease topics; concept pairings with more diverse classifications may have yielded different results. Pakhomov et al. (47) performed a study in 2010 that recorded touch-screen pixel positions for 587 concept pairs evaluated by medical residents for semantic relatedness. Future work on this project may include transforming these data for further evaluation of our method.

Acknowledgments

We would like to thank John Hurdle, MD, PhD, for valuable editorial assistance. This study was supported in part by the Intramural Research Program of the National Institutes of Health, National Library of Medicine. This research was supported in part by an appointment of the first author to the Lister Hill Center Fellows Program sponsored by the National Library of Medicine and administered by the Oak Ridge Institute for Science and Education.

References

  • 1. Pedersen T, Pakhomov SV, Patwardhan S, Chute CG. Measures of semantic similarity and relatedness in the biomedical domain. J Biomed Inform. 2007 Jun;40(3):288–99. doi: 10.1016/j.jbi.2006.06.004.
  • 2. Resnik P. Using information content to evaluate semantic similarity in a taxonomy. IJCAI’95 Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 1; 1995.
  • 3. Thompson-Schill SL, Kurtz KJ, Gabrieli JDE. Effects of semantic and associative relatedness on automatic priming. J Mem Lang. 1998;38(4):440–58.
  • 4. Rindflesch TC, Fiszman M, Libbus B. Semantic interpretation for the biomedical research literature. In: Chen H, Fuller S, Hersh W, Friedman C, editors. Medical informatics: Knowledge management and data mining in biomedicine. New York, NY: Springer; 2005. pp. 399–422.
  • 5. Saruladha K, Aghila G, Raj S, editors. A survey of semantic similarity methods for ontology based information retrieval. The 2nd International Conference on Machine Learning and Computing, ICMLC 2010; February 9–11, 2010; Bangalore, India: IEEE Computer Society; 2010.
  • 6. Jiang P, Hu H, Ren F, Kuroiwa S, editors. Improved semantic similarity computation in question answering system. Proceedings of the Ninth IASTED International Conference on Artificial Intelligence and Soft Computing; 12–14 Sept 2005; Anaheim, CA, USA: ACTA Press; 2006.
  • 7. Montani S. Case-based reasoning for managing noncompliance with clinical guidelines. Comp Intell. 2009;25(3):196–213.
  • 8. Gong Z, Muyeba M, Guo J. Business information query expansion through semantic network. Enterp Inf Syst. 2010;4(1):1–22.
  • 9. Hua Li C, Xiangji Huang J. Spam filtering using semantic similarity approach and adaptive BPNN. Neurocomputing. 2012;92:88–97.
  • 10. Workman TE. Framing serendipitous information seeking behavior for facilitating literature-based discovery: A proposed model. J Am Soc Inf Sci Technol. 2013 (in press).
  • 11. Lindberg DA, Humphreys BL, McCray AT. The Unified Medical Language System. Methods Inf Med. 1993 Aug;32(4):281–91. doi: 10.1055/s-0038-1634945.
  • 12. Zaanan A, Williet N, Hebbar M, Dabakuyo TS, Fartoux L, Mansourbakht T, et al. Gemcitabine plus oxaliplatin in advanced hepatocellular carcinoma: a large multicenter AGEO study. J Hepatol. 2013 Jan;58(1):81–8. doi: 10.1016/j.jhep.2012.09.006.
  • 13. Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proc AMIA Symp. 2001:17–21.
  • 14. Padó S, Lapata M. Dependency-based construction of semantic space models. Comput Linguist. 2007;33(2):161–99.
  • 15. Rada R, Mili H, Bicknell E, Blettner M. Development and application of a metric on semantic nets. IEEE Transactions on Systems, Man and Cybernetics. 1989;19(1):17–30.
  • 16. Leacock C, Chodorow M. Combining local context and WordNet similarity for word sense identification. In: WordNet: An electronic lexical database. Cambridge, MA: MIT Press; 1998. pp. 265–83.
  • 17. Fellbaum C. WordNet: an electronic lexical database. Cambridge, MA: MIT Press; 1998.
  • 18. Choi I, Kim M, editors. Topic distillation using hierarchy concept tree. Proceedings of the Twenty-Sixth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2003; July 28 – August 1, 2003; Toronto, Ont., Canada: Association for Computing Machinery; 2003.
  • 19. McInnes BT, Pedersen T, Pakhomov SV. UMLS-Interface and UMLS-Similarity: open source software for measuring paths and semantic similarity. AMIA Annu Symp Proc. 2009;2009:431–5.
  • 20. Nguyen HA, Al-Mubaid H, editors. New ontology-based semantic similarity measure for the biomedical domain. 2006 IEEE International Conference on Granular Computing; May 10–12, 2006; Atlanta, GA, United States: Institute of Electrical and Electronics Engineers and Computer Society; 2006.
  • 21. Wu Z, Palmer M, editors. Verb semantics and lexical selection. Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics; 26–30 June 1994; San Francisco, CA, USA: Morgan Kaufmann Publishers; 1994.
  • 22. Jiang J, Conrath D, editors. Semantic similarity based on corpus statistics and lexical taxonomy. Proceedings of the 10th International Conference on Research in Computational Linguistics; Taipei, Taiwan; 1997.
  • 23. Lin D, editor. An information-theoretic definition of similarity. Proceedings of Machine Learning (ICML-98); 24–27 July 1998; San Francisco, CA, USA: Morgan Kaufmann Publishers; 1998.
  • 24. Dagan I, Lee L, Pereira FGN. Similarity-based models of word cooccurrence probabilities. Mach Learn. 1999;34(1):43–69.
  • 25. Terra E, Clarke CLA. Frequency estimates for statistical word similarity measures. Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1; Edmonton, Canada. Association for Computational Linguistics; 2003. pp. 165–72.
  • 26. Cha S-H. Comprehensive survey on distance/similarity measures between probability density functions. International Journal of Mathematical Models and Methods in Applied Sciences. 2007;4(1):300–7.
  • 27. Dumais S. Latent semantic indexing (LSI) and TREC-2. In: Harman D, editor. The Third Text Retrieval Conference (TREC-3) (NIST Publication No. 500-225). Washington, DC: National Institute of Standards and Technology; 1994. pp. 219–230.
  • 28. Landauer TK, Dumais ST. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol Rev. 1997;104(2):211.
  • 29. Kwantes P. Using context to build semantics. Psychon Bull Rev. 2005;12(4):703–10. doi: 10.3758/bf03196761.
  • 30. Pantel P, Crestan E, Borkovsky A, Popescu A-M, Vyas V, editors. Web-scale distributional similarity and entity set expansion. 2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, held in conjunction with ACL-IJCNLP 2009; August 6–7, 2009; Singapore: Association for Computational Linguistics (ACL); 2009.
  • 31. Wan S, Angryk RA, editors. Measuring semantic similarity using wordnet-based context vectors. 2007 IEEE International Conference on Systems, Man and Cybernetics (ISIC); 7–10 Oct 2007.
  • 32. Patwardhan S, Pedersen T, editors. Using WordNet-based context vectors to estimate the semantic relatedness of concepts. Trento, Italy; 2006.
  • 33. Schütze H. Automatic word sense discrimination. Comput Linguist. 1998;24(1):97–123.
  • 34. Kilicoglu H, Shin D, Fiszman M, Rosemblat G, Rindflesch TC. SemMedDB: a PubMed-scale repository of biomedical semantic predications. Bioinformatics. 2012 Dec 1;28(23):3158–60. doi: 10.1093/bioinformatics/bts591.
  • 35. Rosenstock J, Raskin P. Hypertension in diabetes mellitus. Cardiol Clin. 1988 Nov;6(4):547–60.
  • 36. Gossain VV, Werk EE, Sholiton LJ, Srivastava L, Knowles HC Jr. Plasma renin activity in juvenile diabetes mellitus and effect of diazoxide. Diabetes. 1975 Sep;24(9):833–5. doi: 10.2337/diab.24.9.833.
  • 37. White F, Wang L, Jelinek HF. Management of hypertension in patients with diabetes mellitus. Exp Clin Cardiol. 2010 Spring;15(1):5–8.
  • 38. Baba T, Murabayashi S, Aoyagi K, Sasaki K, Imamura K, Kudo M, et al. Prevalence of hypertension in diabetes mellitus--its relation to diabetic nephropathy. Tohoku J Exp Med. 1985 Feb;145(2):167–73. doi: 10.1620/tjem.145.167.
  • 39. Tuck ML, Stern N. Diabetes and hypertension. J Cardiovasc Pharmacol. 1992;19(Suppl 6):S8–18. doi: 10.1097/00005344-199219006-00003.
  • 40. Tozawa M, Iseki K, Iseki C, Oshiro S, Higashiuesato Y, Ikemiya Y, et al. Impact of multiple risk factor clustering on the elevation of blood pressure. Hypertens Res. 2002 Nov;25(6):811–6. doi: 10.1291/hypres.25.811.
  • 41. Friedman D. Myocardial infarction associated with recurrent pulmonary embolism in a young woman. Am J Cardiol. 1962 Apr;9:614–8. doi: 10.1016/0002-9149(62)90082-6.
  • 42. Reiss JP, Sam D, Sareen J. Psychosis in multiple sclerosis associated with left temporal lobe lesions on serial MRI scans. J Clin Neurosci. 2006 Feb;13(2):282–4. doi: 10.1016/j.jocn.2005.02.017.
  • 43. Hussain A, Belderbos S. Risperidone depot in the treatment of psychosis associated with multiple sclerosis -- a case report. J Psychopharmacol. 2008 Nov;22(8):925–6. doi: 10.1177/0269881107083997.
  • 44. Erdelez S. Information encountering on the Internet. Proceedings of the National Online Meeting; Learned Information (Europe); 1996. pp. 101–108.
  • 45. Miller CM, Rindflesch TC, Fiszman M, Hristovski D, Shin D, Rosemblat G, et al. A closed literature-based discovery technique finds a mechanistic link between hypogonadism and diminished sleep quality in aging men. Sleep. 2012 Feb;35(2):279–85. doi: 10.5665/sleep.1640.
  • 46. Weeber M, Vos R, Klein H, De Jong-Van Den Berg LT, Aronson AR, Molema G. Generating hypotheses by discovering implicit associations in the literature: a case report of a search for new potential therapeutic uses for thalidomide. J Am Med Inform Assoc. 2003 May-Jun;10(3):252–9. doi: 10.1197/jamia.M1158.
  • 47. Pakhomov S, McInnes B, Adam T, Liu Y, Pedersen T, Melton GB. Semantic similarity and relatedness between clinical terms: an experimental study. AMIA Annu Symp Proc. 2010;2010:572–6.
