Abstract
Accurate concept identification is crucial to biomedical natural language processing. However, ambiguity is common when terms are mapped to biomedical concepts: one term can be mapped to several concepts. A cost-effective approach to disambiguation, in terms of training effort, is semantic classification of the ambiguous terms, provided that the semantic classes of the candidate concepts are available and all differ. We propose such a semantic classification based method to disambiguate ambiguous mappings whose candidate concepts carry different semantic type(s); it can be used with any program that maps terms to UMLS concepts. Classifiers for the semantic types were built using abundant features extracted from a huge corpus of terms mapped to UMLS concepts. The method achieved a precision of 0.709, with unique advantages not offered by comparable methods. Our results also demonstrate the need to further investigate the complementary properties of different methods.
Introduction
Biomedical natural language processing (BioNLP) has been a cornerstone of many research and development undertakings, e.g., information retrieval, document classification, relation extraction, and generation of hypotheses.1 A key step in BioNLP is to identify the concepts discussed in text, or more specifically, to map each term to a precise sense defined in a specific standard terminology. Once the correct sense is determined for each term, the benefit of linking to structured databases (e.g., clinical records, images, or microarrays) and ontological knowledge (e.g., semantic networks) becomes evident. Consequently, word sense disambiguation (WSD) for biomedicine has been the subject of much research; see Schuemie et al2 for a general review. A widely used tool that performs term-to-concept mapping is the MetaMap3 system developed at the National Library of Medicine (NLM), which maps terms to concepts in the Unified Medical Language System (UMLS).4 Each concept in the UMLS contains a set of synonyms and is associated with semantic type(s), which are categorical semantic annotations assigned by human experts. However, MetaMap frequently provides more than one concept (sense) for a term it maps, because many terms have more than one meaning and are thus ambiguous. A disambiguation step is therefore required to achieve the goal of unique sense identification. In this paper we focus on semantic type based WSD methods specifically for MetaMap, but the methods are general as long as a mapping program is available and the senses are associated with a system of semantic classification.
Rindflesch and Aronson5 introduced the notion of disambiguating UMLS concept mappings based on their semantic types. The basic idea is that, whenever the candidate mappings all carry different semantic types, the preferred semantic type is first determined from the context, and the candidate concept with that type is then considered the correct sense in that context. The advantages of semantic type based WSD methods are: 1) the semantic types are manually curated and assigned to every concept, 2) building a classifier for each UMLS semantic type (135 in total) is much more tractable than building classifiers for every ambiguous term, and 3) the problem of sparse training data is diminished. Subsequently, Humphrey et al6 advanced the implementation by using journal descriptors (JD), which are broad topical terms (e.g., “Cardiology” and “Urology”) indexed by the NLM for each journal, to assist in determining the preferred semantic type. The journal descriptor approach works well (with a best precision around 0.787), is applicable to most MEDLINE records, and has recently been incorporated as the WSD module of MetaMap. The research question we ask here is: can the semantic type based method be made general enough for any text processed by MetaMap, not only journal articles but also, for example, clinical notes? We experiment with a method that uses the unambiguous mappings from a huge MetaMap-processed corpus to train the semantic type classifiers and present our preliminary results.
Background
Successful BioNLP depends on effective interpretation of an enormous number of domain-specific terms, especially the precise capture of their meanings in text. The UMLS, originally developed to integrate biomedical terminologies, is widely used in the BioNLP community for its comprehensive coverage and its normalization of terms into concepts as the units of meaning. In addition to aggregating synonyms under concepts, the UMLS annotates each concept with semantic type(s) from its Semantic Network, a categorization scaffold that also provides various useful semantic relations. The MetaMap program, developed to extract biomedical concepts from text, has been used as a versatile preprocessor for a wide variety of biomedical applications; a collection of related publications can be found at the NLM’s SKR website.7 Another resource distributed by the NLM is the MEDLINE/PubMed Baseline Repository (MBR),8 whose annual databases each represent a static view of all the records in MEDLINE/PubMed. The entire 2005 MBR database was processed by MetaMap and can be used as a richly annotated corpus for BioNLP purposes.
Mapping biomedical terms to corresponding concepts is not a straightforward task, owing to ambiguities among biomedical concepts as well as general English concepts. As an example of a MetaMap mapping, the term “discharge” can be mapped to either the concept “Discharge, Body Substance” or the concept “Discharge, Patient Discharge”. The ambiguity cannot be resolved unless contextual information or other external knowledge is applied. Much effort has been devoted to developing automatic WSD solutions and to creating standard test sets for evaluating them.2 To facilitate research on disambiguating biomedical terms, the NLM WSD test collection9 was generated from the ambiguous mappings of a MetaMap-processed corpus. The collection consists of 50 highly frequent ambiguous terms, e.g., “discharge”. Each of the 50 terms has 100 instances randomly sampled from the corpus, which were manually disambiguated by a team of 11 annotators. Considerable WSD research has been done using this test collection. Most existing WSD methods for MetaMap apply term-based supervised training, i.e., for each ambiguous term a classifier is trained on sense-tagged instances. However, preparing such training data is very labor-intensive, and the required sample size can also be demanding: Liu et al10 suggested that supervised WSD is suitable only when each sense has several dozen training instances. It is therefore worth exploring less costly approaches to WSD.
According to Rindflesch & Aronson5 and Humphrey et al,6 semantic type based WSD appears to be a promising alternative, with the advantages introduced earlier. To exemplify how it works, assume we can determine that in the sentence “…excessive neuronal discharge was…” the term “discharge” is of the semantic type “Body Substance” rather than “Health Care Activity”. Then the concept “Discharge, Body Substance”, associated with the former semantic type, should be the correct sense of the term. Although the method is limited to ambiguous mappings in which each candidate concept has a different semantic type, the substantially reduced training load is very attractive. The studies mentioned above are the only two we know of that apply semantic type based WSD to mapping biomedical terms to UMLS concepts. In Rindflesch and Aronson’s early pilot study, very finely crafted rules were implemented, achieving a precision of 0.782; however, the rules could not be generated automatically, and only the term “immunology” was evaluated. The JD based method by Humphrey et al, which is completely automated, has been described in the Introduction. Our method differs in that it does not depend on document-level journal indexing features and hence is applicable to any type of text. We hypothesized that such independence could be achieved once sufficient contextual features are made available to train the semantic type classifiers, and we therefore extracted abundant contexts from the entire MetaMap-processed 2005 MBR corpus.
Methods
I. Overview
Classifiers for the 135 semantic types were built using features extracted from the MetaMap-processed MBR corpus (~14 million abstracts). Corresponding features of the test instances were extracted from the texts provided in the NLM WSD test collection. To determine the correct sense, each ambiguous instance was automatically classified to the semantic type, among those of its candidate senses, that best matched its context, and the sense carrying that type was chosen.
II. Building the semantic type classifiers
A program was created to post-process the MetaMap machine-readable output for the MBR corpus, and only the unambiguous mappings (terms mapped to exactly one concept) were used in training. For each unambiguous mapping, the adjacent terms, and the semantic types of the adjacent unambiguous mappings, within a predetermined flanking window were extracted. The extracted features were then used to train the classifier(s) for the semantic type(s) of the center unambiguous mapping; see Figure 1 and the explanation below.
Figure 1. Feature extraction in a sentence fragment
In Figure 1 the center term (index=0) “stroke” is unambiguously mapped to the single concept “Cerebrovascular accident”. The concept is assigned the semantic type T047 “Disease or Syndrome”, so the adjacent features extracted are used to train T047. With a window size of 3, the adjacent terms would be “obstructive”, “cause”, and “heparin therapy”. Note that syntactic punctuation marks (but not morphologically functional ones such as the hyphen in “insulin-dependent”) and stop words (according to the PubMed list) are discarded, and inflections are reduced (e.g., “caused” → “cause”) using the LVG program developed in the NLM’s SPECIALIST Lexicon project.11 Since MetaMap supports phrase detection, the adjacent terms (e.g., “heparin therapy”) are not just a bag of words but include meaningful biomedical multi-word terms. Both “obstructive” and “heparin therapy” are also unambiguously mapped, so their semantic types, T169 “Functional Concept” and T061 “Therapeutic or Preventive Procedure”, are included to train T047 as well. A Naïve Bayesian (NB) model was implemented; NB was chosen because it is less computationally demanding, which matters given the huge size of the training data. We varied the window size over 1, 3, 5, and the whole sentence, and used the adjacent terms either with or without the adjacent semantic types.
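For illustration, the following is a minimal sketch of the training step under simplified assumptions; the `Mapping` class and the helper names (`extract_window_features`, `train_on_sentence`) are ours, not part of MetaMap’s machine-readable output, and the preprocessing (stop-word removal, lemmatization) is assumed to have been applied already.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Mapping:
    """One MetaMap phrase: its (preprocessed) text and, if unambiguous, its semantic type."""
    text: str                      # e.g., "heparin therapy" (stop words removed, lemmatized)
    semtype: Optional[str] = None  # e.g., "T061"; None if ambiguous or unmapped

def extract_window_features(sentence: List[Mapping], center: int, window: int,
                            use_types: bool = True) -> List[str]:
    """Collect adjacent terms (and optionally adjacent semantic types) around the center mapping."""
    feats = []
    lo, hi = max(0, center - window), min(len(sentence), center + window + 1)
    for i in range(lo, hi):
        if i == center:
            continue
        feats.append("term=" + sentence[i].text)
        if use_types and sentence[i].semtype is not None:
            feats.append("type=" + sentence[i].semtype)
    return feats

# Naive Bayes sufficient statistics: per-type feature counts and type priors.
feature_counts = defaultdict(Counter)   # semantic type -> Counter of context features
type_counts = Counter()                 # semantic type -> number of training mappings

def train_on_sentence(sentence: List[Mapping], window: int = 3) -> None:
    """Every unambiguous mapping serves as one training instance for its own semantic type."""
    for i, m in enumerate(sentence):
        if m.semtype is None:           # skip ambiguous or unmapped phrases
            continue
        feats = extract_window_features(sentence, i, window)
        type_counts[m.semtype] += 1
        feature_counts[m.semtype].update(feats)

# Toy fragment corresponding to Figure 1 ("...obstructive...caused...stroke...heparin therapy...")
sentence = [Mapping("obstructive", "T169"), Mapping("cause"),
            Mapping("stroke", "T047"), Mapping("heparin therapy", "T061")]
train_on_sentence(sentence, window=3)
```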
III. Performing disambiguation
The NLM WSD test collection includes the MetaMap machine-readable output of the texts containing the ambiguous instances. For each ambiguous instance we applied the same process described above to extract the adjacent features. The candidate sense whose semantic type received the highest NB classification probability was chosen.
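Continuing the sketch above (and reusing its `feature_counts` and `type_counts`), the hypothetical code below scores each candidate semantic type with the trained counts and returns the sense whose type scores highest, abstaining on ties. The Laplace smoothing constant is our own simplification (the paper leaves smoothing to future work), and the example candidate dictionary and TUI codes are shown only for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

def log_posterior(semtype: str, feats: List[str], alpha: float = 1.0) -> float:
    """log P(type) + sum_i log P(feature_i | type), with simple Laplace smoothing
    (an illustrative choice, not a detail reported in the paper)."""
    total = sum(type_counts.values())
    if type_counts[semtype] == 0:
        return float("-inf")            # semantic type never seen in training
    score = math.log(type_counts[semtype] / total)
    counts = feature_counts[semtype]
    denom = sum(counts.values()) + alpha * max(1, len(counts))
    for f in feats:
        score += math.log((counts[f] + alpha) / denom)
    return score

def disambiguate(candidates: Dict[str, str], feats: List[str]) -> Optional[str]:
    """candidates maps each candidate concept to its (distinct) semantic type.
    Returns the concept whose type scores highest, or None on a tie (a recall loss)."""
    scored: List[Tuple[float, str]] = sorted(
        ((log_posterior(t, feats), c) for c, t in candidates.items()), reverse=True)
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None                     # tied classification: no disambiguation made
    return scored[0][1]

# Hypothetical example: "discharge" in "...excessive neuronal discharge was..."
feats = ["term=excessive", "term=neuronal"]
choice = disambiguate({"Discharge, Body Substance": "T031",
                       "Discharge, Patient Discharge": "T058"}, feats)
```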
IV. Evaluation
We used the same approach as Humphrey et al in performing our evaluation, except that they tested on a smaller random subset for each ambiguous term. Five ambiguous terms were eliminated entirely, either because their senses correspond to identical semantic types or because they have no candidate senses suggested by MetaMap (i.e., all instances labeled “None of the above senses” by the annotators). The latter criterion also reduced the total number of test instances for individual terms; for example, 7 of the 100 instances of the term “adjustment” were eliminated in this way. Recall and precision were computed: we defined recall as the number of untied disambiguations over the instances tested, and precision as the number of correct disambiguations over the untied ones. Evaluation was conducted over the different combinations of window size and features. We determined a baseline by always choosing the semantic type with the highest frequency, estimated from the unambiguous mappings of our training corpus. To quantify how much the semantic types add to the purely term-based approach, we computed the value (Precision_term&type − Precision_term) for every ambiguous term for which the term-based method outperformed the baseline.
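Written out explicitly in our own notation (per ambiguous term, N_tested is the number of instances kept for testing, N_untied the number receiving an untied classification, and N_correct the number disambiguated correctly), the measures above are:

```latex
\[
\mathrm{Recall} = \frac{N_{\mathrm{untied}}}{N_{\mathrm{tested}}},
\qquad
\mathrm{Precision} = \frac{N_{\mathrm{correct}}}{N_{\mathrm{untied}}},
\qquad
\Delta = \mathrm{Precision}_{\mathrm{term\&type}} - \mathrm{Precision}_{\mathrm{term}} .
\]
```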
Results
The results, including the baseline and the JD approach, are displayed in Table 1. Note that a large standard error of precision is observed for every method, so the averages should be interpreted carefully. Most approaches achieved full recall, except our method with a window size of 1 and with the whole sentence; the imperfect recall resulted from tied classifications. According to the average and standard error of precision, the optimal window size for our method appears to lie between 3 and 5 (the term “lead” follows this pattern exactly). However, for some terms performance increases with window size (e.g., “culture”, “depression”, and “reduction”), while others exhibit the opposite tendency (e.g., “frequency” and “sensitivity”). The assessment of the semantic types as extra features is summarized in Table 2. Adding semantic types as features yielded a minor improvement in precision with considerable variance; in some cases they actually lowered the precision. On average, the JD assisted approach performs best among the methods, and ours falls roughly midway between the JD method and the baseline. For each method we counted the number of terms for which it achieved the highest precision: baseline=13, JD=21, ours=18. The counts do not sum to 45 because for some terms two of the methods tie for the highest precision.
Table 1.
Performance of different methods. For each row the highest precision is boldfaced. Denotations: #– number of instances tested, B– Baseline, JD– Journal Descriptor approach by Humphrey et al, W1– Window size=1, W3– Window size=3, W5– Window size=5, S– Sentence as window, SE– standard error. Under W1~S, the left column used only adjacent terms and the right column used both adjacent terms and semantic types.
| Term | # | B | JD | W1 (terms) | W1 (terms+types) | W3 (terms) | W3 (terms+types) | W5 (terms) | W5 (terms+types) | S (terms) | S (terms+types) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Average Recall | | 1 | 1 | 0.997 | 0.999 | 1 | 1 | 1 | 1 | 0.998 | 0.995 |
| adjustment | 93 | 0.667 | 0.750 | 0.667 | 0.656 | 0.677 | 0.645 | 0.602 | 0.602 | 0.473 | 0.473 |
| blood_pressure | 100 | 0.540 | 0.418 | 0.490 | 0.470 | 0.500 | 0.500 | 0.520 | 0.530 | 0.410 | 0.410 |
| condition | 92 | 0.978 | 0.932 | 0.957 | 0.935 | 0.957 | 0.946 | 0.967 | 0.957 | 0.870 | 0.870 |
| culture | 100 | 0.110 | 0.985 | 0.350 | 0.330 | 0.560 | 0.500 | 0.620 | 0.620 | 0.747 | 0.737 |
| degree | 65 | 0.969 | 0.955 | 0.985 | 0.985 | 0.969 | 0.969 | 0.923 | 0.846 | 0.922 | 0.873 |
| depression | 85 | 0.000 | 0.947 | 0.047 | 0.071 | 0.259 | 0.318 | 0.306 | 0.412 | 0.667 | 0.714 |
| determination | 79 | 1.000 | 1.000 | 0.987 | 0.987 | 0.975 | 0.975 | 0.962 | 0.962 | 0.987 | 0.987 |
| discharge | 75 | 0.987 | 0.963 | 0.947 | 0.960 | 0.933 | 0.933 | 0.933 | 0.947 | 0.947 | 0.947 |
| energy | 99 | 0.010 | 0.731 | 0.222 | 0.273 | 0.333 | 0.394 | 0.434 | 0.455 | 0.500 | 0.480 |
| evaluation | 100 | 0.500 | 0.582 | 0.580 | 0.570 | 0.700 | 0.730 | 0.710 | 0.740 | 0.750 | 0.750 |
| extraction | 87 | 0.057 | 0.983 | 0.326 | 0.333 | 0.529 | 0.598 | 0.529 | 0.575 | 0.770 | 0.793 |
| failure | 29 | 0.138 | 0.944 | 0.379 | 0.448 | 0.621 | 0.655 | 0.621 | 0.586 | 0.621 | 0.655 |
| fat | 73 | 0.027 | 0.750 | 0.556 | 0.583 | 0.630 | 0.644 | 0.575 | 0.603 | 0.575 | 0.534 |
| fit | 18 | 0.000 | 1.000 | 0.833 | 0.833 | 0.722 | 0.778 | 0.667 | 0.722 | 0.667 | 0.611 |
| fluid | 100 | 0.000 | 0.045 | 0.030 | 0.020 | 0.070 | 0.080 | 0.090 | 0.110 | 0.141 | 0.172 |
| frequency | 94 | 1.000 | 0.905 | 0.830 | 0.787 | 0.415 | 0.394 | 0.436 | 0.394 | 0.383 | 0.372 |
| ganglion | 99 | 0.929 | 0.940 | 0.929 | 0.929 | 0.909 | 0.909 | 0.919 | 0.919 | 0.879 | 0.879 |
| glucose | 100 | 0.910 | 0.388 | 0.890 | 0.890 | 0.860 | 0.860 | 0.830 | 0.830 | 0.790 | 0.800 |
| growth | 100 | 0.630 | 0.702 | 0.650 | 0.660 | 0.690 | 0.710 | 0.680 | 0.690 | 0.670 | 0.687 |
| immunosuppression | 100 | 0.590 | 0.761 | 0.580 | 0.580 | 0.620 | 0.620 | 0.610 | 0.610 | 0.710 | 0.620 |
| implantation | 97 | 0.825 | 0.924 | 0.794 | 0.804 | 0.773 | 0.732 | 0.856 | 0.784 | 0.814 | 0.784 |
| inhibition | 99 | 0.010 | 1.000 | 0.182 | 0.202 | 0.404 | 0.465 | 0.556 | 0.626 | 0.677 | 0.707 |
| japanese | 79 | 0.924 | 0.566 | 0.924 | 0.911 | 0.937 | 0.937 | 0.924 | 0.937 | 0.885 | 0.883 |
| lead | 29 | 0.069 | 0.389 | 0.483 | 0.448 | 0.793 | 0.793 | 0.759 | 0.724 | 0.483 | 0.586 |
| mole | 84 | 0.988 | 0.982 | 0.988 | 0.988 | 1.000 | 0.988 | 1.000 | 0.988 | 0.976 | 0.964 |
| mosaic | 97 | 0.464 | 0.677 | 0.406 | 0.396 | 0.454 | 0.464 | 0.505 | 0.495 | 0.608 | 0.608 |
| nutrition | 89 | 0.315 | 0.387 | 0.393 | 0.404 | 0.371 | 0.427 | 0.371 | 0.371 | 0.348 | 0.364 |
| pathology | 99 | 0.859 | 0.746 | 0.818 | 0.828 | 0.889 | 0.859 | 0.848 | 0.818 | 0.778 | 0.778 |
| pressure | 95 | 1.000 | 0.121 | 0.842 | 0.811 | 0.800 | 0.716 | 0.747 | 0.747 | 0.568 | 0.589 |
| radiation | 98 | 0.378 | 0.803 | 0.551 | 0.561 | 0.571 | 0.582 | 0.612 | 0.633 | 0.592 | 0.561 |
| reduction | 11 | 0.182 | 1.000 | 0.400 | 0.545 | 0.818 | 0.909 | 0.818 | 0.909 | 1.000 | 1.000 |
| repair | 68 | 0.765 | 0.864 | 0.794 | 0.779 | 0.794 | 0.765 | 0.750 | 0.765 | 0.735 | 0.773 |
| resistance | 3 | 0.000 | 1.000 | 0.333 | 0.333 | 0.333 | 0.333 | 0.333 | 0.000 | 0.000 | 0.000 |
| scale | 65 | 1.000 | 0.628 | 0.723 | 0.708 | 0.769 | 0.754 | 0.846 | 0.769 | 0.723 | 0.641 |
| secretion | 100 | 0.010 | 0.940 | 0.090 | 0.100 | 0.050 | 0.090 | 0.110 | 0.130 | 0.150 | 0.131 |
| sensitivity | 51 | 0.961 | 0.829 | 0.824 | 0.745 | 0.804 | 0.784 | 0.765 | 0.706 | 0.549 | 0.529 |
| single | 100 | 0.990 | 0.985 | 0.990 | 0.970 | 0.990 | 0.980 | 0.990 | 0.980 | 0.920 | 0.908 |
| strains | 93 | 0.989 | 0.984 | 0.989 | 0.989 | 0.957 | 0.957 | 0.968 | 0.968 | 0.978 | 0.968 |
| support | 10 | 0.800 | 1.000 | 0.900 | 0.900 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| surgery | 99 | 0.980 | 0.925 | 0.990 | 0.990 | 0.980 | 0.980 | 0.970 | 0.960 | 0.899 | 0.879 |
| transient | 100 | 0.990 | 0.985 | 0.990 | 0.990 | 0.990 | 0.990 | 0.990 | 0.990 | 0.990 | 0.990 |
| transport | 94 | 0.011 | 0.969 | 0.468 | 0.468 | 0.809 | 0.798 | 0.787 | 0.830 | 0.830 | 0.872 |
| ultrasound | 100 | 0.840 | 0.806 | 0.870 | 0.870 | 0.880 | 0.870 | 0.840 | 0.840 | 0.899 | 0.918 |
| variation | 100 | 0.800 | 0.702 | 0.800 | 0.800 | 0.800 | 0.790 | 0.780 | 0.750 | 0.740 | 0.720 |
| white | 90 | 0.456 | 0.533 | 0.689 | 0.700 | 0.778 | 0.767 | 0.778 | 0.778 | 0.811 | 0.800 |
| Average Precision | | 0.570 | 0.787 | 0.655 | 0.657 | 0.704 | 0.709 | 0.708 | 0.702 | 0.699 | 0.696 |
| SE of Precision | | 0.400 | 0.243 | 0.289 | 0.280 | 0.252 | 0.240 | 0.234 | 0.242 | 0.240 | 0.239 |
Table 2.
The second column shows the average (with standard error) of (Precision_term&type − Precision_term). Denotations: same as Table 1.
| Window | Average ± SE |
|---|---|
| W1 | 0.008 ± 0.032 |
| W3 | 0.012 ± 0.034 |
| W5 | 0.002 ± 0.075 |
| S | 0.004 ± 0.038 |
Discussion
Like Humphrey et al, we consider our method unsupervised: no sense-tagging was involved in training a classifier for each ambiguous term, and the features for training the semantic type classifiers were generated automatically. Our approach represents a self-feedback mechanism for MetaMap, i.e., its own output can be used to support its WSD; the same would hold if a MetaMap-processed clinical corpus were used to train the classifiers. Moreover, benefiting from the huge training corpus, the feature space obtained in this study is more representative than that of much smaller-scale studies. The varied patterns of optimal window size we observed are interesting and reveal a complexity that should not be overlooked in WSD research. Rather than suggesting a universal window size, it seems that more intricate factors need to be considered to determine the optimal window size, preferably on a case-by-case basis. We hypothesize that the semantic types do not help in certain cases because of noise in the MetaMap mappings (some unambiguous mappings are incorrect) and because closely related semantic types have limited discriminative power. The mapping noise in fact affects our method as a whole, because accurate mappings are required not only for the adjacent semantic types but also for the center mapping that serves as the pivot for collecting the adjacent training features. A viable solution is to use the UMLS ontological relations to help select reliable concept mappings, an idea introduced by Liu et al12 for generating a sense-tagged corpus. This solution can enhance our training quality and would be applicable to any MetaMap-processed text.
The highest-precision coverage in Table 1 shows that the methods are complementary. Some of the difficult cases discussed by Humphrey et al can be handled by the baseline method with impressive precision, e.g., “glucose” and “pressure”. However, we have not yet found clear patterns for predicting which method will perform best for a particular term. When features are sparse, the NB model falls back on the priors of the classes (here, the semantic types), which explains the reasonable correlation with the baseline. Nonetheless, even with a window size of 1 our method still outperforms the baseline on average. We achieved the highest precision for 18 of the 45 terms tested. Although this way of counting coverage favors us in that we treated our various configurations as a single method, it is clear (with a convincing margin) that for some terms the best precision cannot be achieved by the baseline or the JD approach, e.g., “lead”, “japanese”, and “white”. Among the difficult cases discussed by Humphrey et al, our method achieved the highest precision for “evaluation”, “lead”, and “japanese”. Their failure analysis enumerates hard-to-differentiate semantic types (e.g., Research Activity vs. Health Care Activity), and we hypothesize that those failures may result from the weak discriminating power of the journal descriptors for certain terms and semantic types.
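This fallback behavior can be read directly off the Naïve Bayes decision rule (written here in our own notation, not a formula from the paper): when the contextual features f_1, …, f_n of an instance are missing or uninformative, the likelihood terms contribute little and the decision is dominated by the prior over semantic types, which is exactly the most-frequent-type baseline.

```latex
\[
\hat{t} \;=\; \arg\max_{t}\; P(t)\prod_{i=1}^{n} P(f_i \mid t)
\quad\longrightarrow\quad
\arg\max_{t}\; P(t) \qquad \text{(no informative features)}
\]
```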
In summary, we found no conclusive evidence of a universally optimal parameter setting for individual WSD methods, nor of a universally optimal WSD method. Further systematic failure analysis is needed to uncover each method’s conditions of applicability, so as to enable a solution that dynamically combines the optimal modules. Future work also includes: 1) improving the NB classifier (e.g., with appropriate smoothing techniques) and testing other classification algorithms, 2) studying Liu et al’s method to improve the reliability of selecting our training instances, and 3) extending our experiments to terms beyond the NLM test collection and to other types of text (e.g., clinical notes).
Conclusion
We experimented with a WSD method based on semantic type classification and achieved a best precision of 0.709. The contributions of this research are that we proposed an unsupervised (and thus scalable) method that 1) obtains abundant training features from a huge corpus, 2) is independent of publication type (and hence generalizable), and 3) shows strengths complementary to those of the other methods. In addition, we showed that the optimal window size varies across terms, whereas earlier research treated it as constant given an optimized classifier. We suggest more research is needed to improve our understanding of which method (and parameters) to use for specific types of ambiguous terms.
Acknowledgments
This work was supported by Grants R01 LM7659 and R01 LM8635 from the NLM.
References
- 1. Cohen AM, Hersh WR. A survey of current work in biomedical text mining. Brief Bioinform. 2005;6(1):57–71. doi:10.1093/bib/6.1.57.
- 2. Schuemie MJ, Kors JA, Mons B. Word sense disambiguation in the biomedical domain: an overview. J Comput Biol. 2005;12(5):554–65. doi:10.1089/cmb.2005.12.554.
- 3. Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proc AMIA Symp. 2001:17–21.
- 4. Lindberg DA, Humphreys BL, McCray AT. The Unified Medical Language System. Methods Inf Med. 1993;32(4):281–91. doi:10.1055/s-0038-1634945.
- 5. Rindflesch TC, Aronson AR. Ambiguity resolution while mapping free text to the UMLS Metathesaurus. Proc Symp Comput Appl Med Care. 1994:240–4.
- 6. Humphrey SM, et al. Word sense disambiguation by selecting the best semantic type based on journal descriptor indexing: preliminary experiment. J Am Soc Inf Sci Technol. 2006;57(1):96–113. doi:10.1002/asi.20257.
- 7. Semantic Knowledge Representations / Papers. http://skr.nlm.nih.gov/papers/index.shtml
- 8. MEDLINE/PubMed Baseline Repository (MBR). http://mbr.nlm.nih.gov/
- 9. Weeber M, Mork J, Aronson AR. Developing a test collection for biomedical word sense disambiguation. Proc AMIA Symp. 2001:746–50.
- 10. Liu H, Teller V, Friedman C. A multi-aspect comparison study of supervised word sense disambiguation. J Am Med Inform Assoc. 2004;11(4):320–31. doi:10.1197/jamia.M1533.
- 11. SPECIALIST Lexical Tools (LVG). http://lexsrv3.nlm.nih.gov/SPECIALIST/Projects/lvg/current/index.html
- 12. Liu H, Johnson SB, Friedman C. Automatic resolution of ambiguous terms based on machine learning and conceptual relations in the UMLS. J Am Med Inform Assoc. 2002;9(6):621–36. doi:10.1197/jamia.M1101.

