Abstract
Recognition and identification of abbreviations is an important and challenging task in clinical natural language processing (NLP). A comprehensive lexical resource composed of all common, useful clinical abbreviations would have broad applicability. The authors present a corpus-based method for creating such a lexical resource using machine learning (ML) and test its ability to automatically detect abbreviations in hospital discharge summaries. Domain experts manually annotated abbreviations in seventy discharge summaries, which were randomly split into a training set (40 documents) and a test set (30 documents). We implemented and evaluated several ML algorithms using the training set and a list of pre-defined features. Evaluation on the test set showed that the Random Forest classifier achieved the best individual F-measure of 94.8% (precision 98.8%, recall 91.2%). When a voting scheme was used to combine the outputs of the ML classifiers, the system reached an F-measure of 95.7%.
Introduction
A pervasive characteristic of clinical texts is their frequent use of abbreviations.1 A study examining physician-entered inpatient admission notes at New York Presbyterian Hospital (NYPH) showed that abbreviations constituted 17.1% of total word tokens in those documents.2 Healthcare professionals use abbreviations as a convenient way to represent long biomedical words and phrases. Those abbreviations often carry important healthcare information (e.g., names of diseases, drugs, and procedures), which must be recognizable and accurate in health records. Nevertheless, studies have shown that confusion caused by frequently changing and highly ambiguous abbreviations impedes effective communication among healthcare providers and patients3,4,5, potentially diminishing healthcare quality and safety.6 The 2008 National Patient Safety Goals released by the Joint Commission required that U.S. hospitals stop using a list of dangerously ambiguous abbreviations. Recent research suggests that medical errors caused by the use of unapproved abbreviations might be reduced using a clinical natural language processing (NLP) system.7
Any NLP system attempting to extract clinical information from free text must recognize and identify abbreviations correctly. Doing so is challenging for two key reasons. First, abbreviation generation remains highly dynamic: clinicians often create their own de novo abbreviations, which vary by context (e.g., the type of clinical note involved). Second, abbreviations are often ambiguous: one abbreviation may have multiple meanings (e.g., “pt - patient” and “pt - physical therapy”). Therefore, disambiguation methods are needed to determine the correct meaning of an abbreviation in a given context. Only a small number of previous studies have evaluated the performance of clinical NLP systems on handling abbreviations.
A comprehensive lexical database containing all clinical abbreviations and their possible senses could enhance clinical NLP systems. None yet exists. Some current knowledge bases, such as the National Library of Medicine’s UMLS (Unified Medical Language System), contain important clinical abbreviations, but are not complete. One study showed that UMLS and its derivative abbreviations only covered about 67% of abbreviations in typed hospital admission notes.2 While potentially useful manually extracted abbreviation lists exist (such as that of Berman8 with over 12,000 clinical abbreviations and their senses from pathology reports, and that of Stetson9 with 3000 clinical abbreviations extracted from physicians’ NYPH sign-out notes), such collections are time-consuming to build and difficult to keep up-to-date. Therefore, automated methods that can detect abbreviations and their senses from a large corpus could potentially provide value in NLP contexts.
In the biomedical literature, abbreviations usually co-occur with their expanded forms (often with one or the other enclosed in parentheses) at least once in a document. Various approaches exist that extract abbreviations with their senses using such “long form/short form” patterns.10,11 Kuo et al.12 used machine-learning methods to determine whether candidate “long form/short form” patterns defined abbreviations, reporting a precision of 92% and a recall of 81% on biological literature. Unfortunately, abbreviations in clinical documents do not usually occur alongside their long forms, which makes recognition and disambiguation more difficult. A possible approach to building a representative clinical abbreviation/sense database would involve first detecting candidate abbreviations from a large set of actual clinical documents, and then finding possible senses of each abbreviation using a clustering-based method.13
Researchers have applied machine learning methods to detect abbreviations in general English texts. Toole14 described a decision-tree-based method to identify abbreviations among words not recognized by an NLP system, reporting a precision of 91% on free text from an Air Safety Reporting System database. Previously, Xu et al. conducted an initial study applying a decision tree method to automated abbreviation detection in hospital admission notes.2 The pilot evaluation used four annotated admission notes and showed encouraging results (precision 91% and recall 80% at the “ALL” evaluation level; see the Evaluation section), indicating the potential usefulness of machine-learning-based (ML-based) abbreviation detection methods.
This paper presents a more extensive study of ML-based abbreviation detection using hospital discharge summaries, examining a number of ML algorithms. Our data set comprised 70 discharge summaries, in which abbreviations were manually labeled by three clinical domain experts not familiar with the ML algorithms (authors JD, STR, and RAM). We compared three different ML algorithms for abbreviation detection: Decision Trees (DT), Support Vector Machines (SVMs), and Random Forests (RF). We also introduced new features for abbreviation detection, such as vowel and consonant information, to potentially improve performance. Furthermore, we developed a voting scheme that combined the outputs of the three individual classifiers and compared its performance in detecting abbreviations to that of the individual detectors.
Methods
This study defined an abbreviation as any type of shortened term in clinical text, including acronyms (e.g., “MI - myocardial infarction”), shortened words or phrases (e.g., “pt - patient”), and symbols (e.g., “etoh - alcohol”). Figure 1 provides an overview of the study design. Three physicians (authors JD, STR, and RAM) manually annotated abbreviations in 70 discharge summaries. We randomly split the data set into a training set (40 documents) and a test set (30 documents). Annotated abbreviations were treated as positive samples, and other tokens (words or non-abbreviation alphanumeric letter groupings set off by white space or punctuation marks that annotators did not classify as abbreviations) were treated as negative samples. The classification task was to determine whether a word token was an abbreviation, based on the expert classifications. The study evaluated three ML algorithms: DT, SVMs, and RF. We optimized the feature sets and parameters of each ML algorithm by varying them and testing the results on the training set using 10-fold cross validation. We then applied the optimized ML-based abbreviation detection methods to the independent test set and report their performance.
Figure 1:
An overview of the study design. Seventy documents were randomly broken into three groups of 30, of which ten documents were the same for all three annotators. The 70 annotated documents were randomly split into a training set (40 documents) and a test set (30 documents). The features and parameters of the ML methods were optimized using 10-fold cross validation on the training set. The final ML models, trained on the entire training set, were used to predict abbreviations in the test set.
Data set
In this study, we used clinical documents from Vanderbilt University Medical Center’s Synthetic Derivative (SD) database15, which contains de-identified copies of the electronic health record documents at Vanderbilt University Hospital. From the SD, we collected 10 years (1999–2008) of discharge summaries (560,650 documents in total), from which 70 documents were randomly selected. The study was approved by the Vanderbilt Institutional Review Board. Abbreviations in those documents were annotated in a sequential manner. Each document was first pre-processed by a program that automatically labeled abbreviations using a reliable abbreviation dictionary (described below). Each document with automatically labeled abbreviations was then displayed to one or more physician reviewers in a web-based annotation interface, where the annotator could highlight new abbreviations or remove wrong labels introduced by the pre-processing program. Three physicians carried out the annotation, each manually annotating abbreviations in 30 discharge summaries, of which 10 documents were identical for all three experts. Based on the 10 overlapping documents, we report inter-annotator agreement (IAA) using the kappa statistic implemented in Stata 9.2.
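For readers who want to reproduce a comparable agreement figure, a minimal sketch follows, assuming Fleiss’ kappa for three raters over token-level labels. This is an illustrative re-computation, not the authors’ procedure; the paper only states that Stata 9.2’s kappa implementation was used, and the labels array below is toy data.

```python
# Toy multi-rater agreement via Fleiss' kappa: one row per token in the
# ten overlapping documents, one column per annotator (1 = abbreviation).
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

labels = np.array([
    [1, 1, 1],   # all three annotators marked this token
    [0, 0, 0],
    [1, 1, 0],   # a disagreement
    [0, 0, 0],
])

table, _ = aggregate_raters(labels)  # tokens x categories count table
print(f"Fleiss' kappa: {fleiss_kappa(table):.3f}")
```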
ML-based Abbreviation Detection Methods
The study used all expert-annotated abbreviations as positive samples, and non-abbreviation tokens (i.e., tokens either unmarked by experts or specifically marked as not being abbreviations, excluding punctuation marks) as negative samples, to train a classifier. For the DT and RF algorithms, we used the implementations in Weka16 3.6. For the SVMs, we used the libsvm module17. Five different categories of features were tested in this study (see the feature-extraction sketch after this list):
Word formation features, which include: a) special characters such as “-” and “.”; b) alphabetic/numeric characters and their combination; c) features derived from the positions of numbers and letters; d) information about upper-case letters and their positions in the word; e) length of the word; f) a misspelling feature derived using Aspell18: for tokens not in the English word list or the medical term list, we calculated the edit distance between the token and the top three candidate suggestions provided by Aspell, using a threshold of 1 to discriminate likely misspelled words from words less likely to be misspelled.
Features derived from the combination of vowel letters and consonant letters, which include: a) whether all letters are vowels; b) whether all letters are consonants; c) whether both vowel letters and consonant letters are present.
Features from knowledge bases, such as whether a word is an English word or a medical term (see the detailed description of the English word list and medical term list in the last paragraph of this section).
Features from the full corpus: the word frequency feature was derived from the complete set of 560,650 discharge summaries, defined as the total number of occurrences of the word divided by the number of documents in the corpus.
Features from local context, derived from the tokens immediately before and after the current word, including whether the previous/next token is: the beginning of a sentence, a digit or number, an English word or a medical term, punctuation, or the end of a sentence.
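To make the five categories concrete, here is a minimal feature-extraction sketch in Python. The sets english_words and medical_words, the document-frequency map doc_freq, and the specific feature encodings are assumptions for illustration; the study’s actual encodings may differ.

```python
VOWELS = set("aeiouAEIOU")

def token_features(tokens, i, english_words, medical_words, doc_freq, n_docs):
    """Sketch of per-token features; tokens is the tokenized document."""
    t = tokens[i]
    alpha = [c for c in t if c.isalpha()]
    return {
        # 1) word formation
        "has_dash": "-" in t,
        "has_period": "." in t,
        "has_digit": any(c.isdigit() for c in t),
        "all_upper": t.isupper(),
        "initial_upper": t[:1].isupper(),
        "length": len(t),
        # 2) vowel/consonant composition
        "all_vowels": bool(alpha) and all(c in VOWELS for c in alpha),
        "all_consonants": bool(alpha) and all(c not in VOWELS for c in alpha),
        # 3) knowledge-base lookups
        "is_english_word": t.lower() in english_words,
        "is_medical_term": t.lower() in medical_words,
        # 4) global corpus frequency: occurrences per document
        "corpus_freq": doc_freq.get(t.lower(), 0) / n_docs,
        # 5) local context (previous/next token)
        "prev_is_digit": i > 0 and tokens[i - 1].isdigit(),
        "next_is_digit": i + 1 < len(tokens) and tokens[i + 1].isdigit(),
    }
```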
We tested different combinations of feature sets and optimized them using the training set. Parameters for each machine learning algorithm were also optimized on the training set using a 10-fold cross validation.
In addition, we developed a baseline method that used two dictionaries, one of English words and one of medical words. If a token appeared in neither dictionary, we classified it as a potential abbreviation. The English word list was Knuth’s list of 110,573 American English words.19 The medical word dictionary was generated in three steps. In step 1, we collected all the words from the UMLS Metathesaurus, resulting in 982,088 words in total. In step 2, we constructed a list of 54,627 unique abbreviations from all available knowledge sources of abbreviations, such as UMLS LRABR, ADAM11, and Berman’s abbreviation list.8 Finally, we removed the abbreviations identified in step 2 from the medical words collected in step 1, leaving 962,473 words to serve as the medical word dictionary (excluding known abbreviations).
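The baseline reduces to two set lookups; a minimal sketch, assuming english_words and medical_words are the sets built as described above:

```python
def baseline_is_abbreviation(token, english_words, medical_words):
    # A token found in neither dictionary is flagged as a candidate abbreviation.
    t = token.lower()
    return t not in english_words and t not in medical_words
```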
Combined Methods using Simple Voting Schemes
As methods that combine outputs from different classifiers often improve performance (e.g., for named entity recognition tasks20), we also investigated combinatorial methods for abbreviation detection. The three individual ML classifiers (DT, RF, and SVM) were combined using three different voting schemes to predict abbreviations from the test set: 1) if a token was predicted as an abbreviation by any one method, it was taken as an abbreviation; 2) if a token was predicted as an abbreviation by any two methods, it was taken as an abbreviation; 3) if a token was predicted as an abbreviation by all three methods, it was taken as an abbreviation.
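The three schemes are thresholded votes over the classifiers’ boolean predictions; a minimal sketch:

```python
def vote(predictions, k):
    """predictions: one boolean per classifier (DT, RF, SVM); k: votes required."""
    return sum(predictions) >= k

# Scheme 1 (k=1) accepts any single vote; scheme 3 (k=3) requires unanimity.
print(vote([True, False, False], k=1))  # True
print(vote([True, False, False], k=3))  # False
```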
Evaluation
Outputs from the different abbreviation detection methods were compared to the reference standard; we report precision, recall, and F-measure for each method. Precision is defined as the ratio of the number of abbreviations correctly predicted by the system to the number of all abbreviations predicted by the system. Recall is defined as the ratio of the number of abbreviations correctly predicted by the system to the number of all abbreviations in the reference standard. F-measure21 was calculated as 2 × Precision × Recall / (Precision + Recall).
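A minimal sketch of these metrics from true-positive, false-positive, and false-negative counts (assuming non-zero denominators):

```python
def precision_recall_f(tp, fp, fn):
    precision = tp / (tp + fp)  # correct predictions / all predictions
    recall = tp / (tp + fn)     # correct predictions / all gold abbreviations
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```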
Parameters and feature sets were optimized on the training set of 40 documents using 10-fold cross validation. We also report the results with different feature combinations on the training set. Finally, we trained the models on the entire training set and applied them to the test set of 30 documents. Two gold standards were used to evaluate performance on the test set. The first, gold standard 1 (GS-1), was the initial annotation by the physicians, produced through the sequential annotation method described above. The second, gold standard 2 (GS-2), was a revision of GS-1: we collected all discrepancies between the system output and GS-1 and presented those samples to the physicians again, who manually reviewed them and made a final judgment.
To better characterize the different methods, we report the performance of each at three levels: 1) All abbreviations (ALL): we counted every occurrence of every abbreviation in the test set, so duplicate abbreviations were included as separate instances; 2) Unique abbreviations (Unique): we considered each individual abbreviation only once, irrespective of how often it appeared in the documents under analysis; 3) Unknown abbreviations (Unknown): only unique abbreviations from the test set that did not occur in the training set, intended to evaluate how well a method could detect previously unseen abbreviations.
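A minimal sketch of the three levels, assuming gold is the list of annotated abbreviation occurrences in the test set and train_abbrs is the set of abbreviations seen in training (case folding here is an assumption):

```python
def evaluation_levels(gold, train_abbrs):
    all_occurrences = list(gold)                         # ALL: every occurrence counts
    unique = {a.lower() for a in gold}                   # Unique: each abbreviation once
    unknown = unique - {a.lower() for a in train_abbrs}  # Unknown: unseen in training
    return all_occurrences, unique, unknown
```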
Results
Characteristics of the Data Set
According to the expert annotation, the training set contained 18,225 tokens, of which 16,839 were negative samples (not abbreviations) and 1,386 were positive samples (abbreviations). There were 448 unique abbreviations in the test set, of which 229 were unknown abbreviations that did not appear in the training set. The detailed numbers of positive and negative samples in the training and test sets are shown in Table 1.
Table 1:
Negative and positive samples in training set and test set.
| | Number of Documents | Positive Samples (All) | Positive Samples (Unique) | Negative Samples (All) | Negative Samples (Unique) |
|---|---|---|---|---|---|
| Training set | 40 | 1,386 | 475 | 16,839 | 3,736 |
| Test set | 30 | 1,402 | 448 | 12,511 | 3,060 |
Per the experts’ annotations, the average number of abbreviations per clinical document was 41. Based on the annotations of the 10 overlapping documents, the kappa value among the three experts’ annotations was 0.886.
Parameter and Feature Set Optimization using the Training Set
Different combinations of feature sets were tested; their performance on the training set is shown in Table 2. The numbers in Table 2 are averages from 10-fold cross validation. Parameters of the ML algorithms were optimized based on F-measures. The cross-validation results showed that the best combination included all features except the context ones (1+2+3+4). The optimized parameters for each method using the best feature combination were as follows (see the configuration sketch after Table 2). For DT, the pruning confidence was 0.1 and the minimum number of instances per leaf was 2. For RF, the number of trees was 300, the number of features considered at each split was 4, and the seed for the random value generator was left at the default value of 1. For SVM, the penalty parameter (c) was 128, the kernel parameter (g) was 8, and the weight for positive samples (w1) was 2.5; other parameters were left at their defaults.
Table 2:
Results of cross validation on the training set using different combinations of features.
| Method | Metric | 1 | 1+2 | 1+2+3 | 1+2+3+4 | 1+2+3+4+5 |
|---|---|---|---|---|---|---|
| DT | Precision | 78.5% | 78.2% | 90.8% | 90.4% | 90.1% |
| | Recall | 53.6% | 71.0% | 80.1% | 91.7% | 89.6% |
| | F-score | 63.7% | 74.4% | 85.1% | 91.0% | 89.9% |
| RF | Precision | 79.4% | 78.5% | 92.3% | 93.1% | 93.2% |
| | Recall | 54.1% | 73.0% | 83.2% | 94.4% | 92.3% |
| | F-score | 64.4% | 75.7% | 87.5% | 93.7% | 92.8% |
| SVM | Precision | 67.7% | 70.1% | 83.2% | 89.1% | 93.5% |
| | Recall | 62.0% | 80.8% | 90.7% | 92.7% | 74.7% |
| | F-score | 64.7% | 75.1% | 86.8% | 91.0% | 83.1% |
Description of feature sets: 1 - word formation features; 2 - vowel/consonant features; 3 - knowledge-based features; 4 - word frequency (global feature); 5 - context features derived from the previous/next token.
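For reference, a minimal sketch mapping the reported RF and SVM settings onto scikit-learn equivalents. The study itself used Weka 3.6 (whose J48 pruning confidence has no direct scikit-learn analogue) and LIBSVM, so these are approximations, not the authors’ exact models:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rf = RandomForestClassifier(
    n_estimators=300,  # number of trees
    max_features=4,    # features considered at each split
    random_state=1,    # default seed of 1
)
svm = SVC(
    C=128,                  # penalty parameter (c)
    kernel="rbf",
    gamma=8,                # kernel parameter (g)
    class_weight={1: 2.5},  # weight for positive samples (w1)
)
```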
Results of the ML-based methods using the Test Set
Using the optimized parameters, we trained all models on the entire training set and generated predictions on the test set. The three machine learning methods all outperformed the baseline method. RF reached the best precision of 95.2% and recall of 91.0% on the initial gold standard (GS-1). After the experts revised the gold standard (GS-2), the precision and recall of RF reached 98.8% and 91.2% for “ALL” abbreviations. However, SVM showed better results on detecting “Unique” and “Unknown” abbreviations, with F-measures of 89.6% and 83.2%, respectively. Table 3 shows the detailed results using GS-1 and Table 4 shows the results using GS-2.
Table 3:
Detection on the test set using the initial gold standard (GS-1).
| Method | Level | Precision | Recall | F-score |
|---|---|---|---|---|
| DT | All | 94.1% | 89.4% | 91.6% |
| | Unique | 91.1% | 81.0% | 85.7% |
| | Unknown | 83.1% | 76.4% | 79.6% |
| RF | All | 95.2% | 91.0% | 93.1% |
| | Unique | 94.0% | 81.1% | 87.1% |
| | Unknown | 87.3% | 69.3% | 77.2% |
| SVM | All | 92.7% | 90.8% | 91.8% |
| | Unique | 94.5% | 82.3% | 88.0% |
| | Unknown | 88.6% | 72.5% | 79.8% |
| Baseline | All | 81.6% | 85.3% | 83.4% |
| | Unique | 71.4% | 85.1% | 77.6% |
Table 4:
Detection on the test set using the revised gold standard (GS-2).
| Method | Level | Precision | Recall | F-score |
|---|---|---|---|---|
| DT | All | 98.0% | 89.8% | 93.7% |
| | Unique | 95.2% | 82.9% | 88.6% |
| | Unknown | 90.8% | 78.3% | 84.1% |
| RF | All | 98.8% | 91.2% | 94.8% |
| | Unique | 96.7% | 81.8% | 88.7% |
| | Unknown | 93.1% | 71.2% | 80.7% |
| SVM | All | 98.1% | 91.3% | 94.5% |
| | Unique | 97.5% | 82.9% | 89.6% |
| | Unknown | 94.8% | 74.1% | 83.2% |
| Baseline | All | 86.8% | 86.0% | 86.4% |
| | Unique | 73.7% | 87.3% | 79.9% |
Results of the Combined Methods using the Test Set
The combined method using voting scheme 1 achieved the highest F-measures of 95.7% for “ALL” abbreviations and 86.1% for “Unknown” abbreviations on GS-2. Table 5 shows the results of combined methods that used different voting schemes.
Table 5:
Detection on the test set using different voting schemes (GS-2).
| Scheme | Level | Precision | Recall | F-score |
|---|---|---|---|---|
| Scheme 1 | All | 97.0% | 94.5% | 95.7% |
| | Unique | 93.0% | 90.0% | 91.6% |
| | Unknown | 86.9% | 85.2% | 86.1% |
| Scheme 2 | All | 98.6% | 92.0% | 95.2% |
| | Unique | 98.2% | 83.2% | 90.1% |
| | Unknown | 94.4% | 75.8% | 84.1% |
| Scheme 3 | All | 99.5% | 85.8% | 92.2% |
| | Unique | 98.5% | 73.8% | 84.4% |
| | Unknown | 96.7% | 61.5% | 75.5% |
Discussion
We conducted an extensive study of detecting abbreviations in clinical text, investigating different machine learning algorithms, different types of features, and a new combination strategy. The combined method achieved the highest F-measure of 95.7%, which was superior to the previously reported simple decision-tree-based method.2
The three physicians had good IAA, with a kappa of 0.886, suggesting that recognizing abbreviations in clinical text is not a difficult task for physicians. However, the pre-annotation method used in this study could have helped improve agreement among annotators. We reported performance for both all abbreviations and unique abbreviations because we observed ambiguity in many abbreviations. For example, the word “mom” (meaning “mother”) was not an abbreviation most of the time; however, it was labeled as an abbreviation once, with the meaning “milk of magnesia”. This was a case of ambiguity between an abbreviation and an English word. Similar examples include “is (immune system)” and “hit (heparin induced thrombocytopenia)”. This issue also makes it difficult to build a reliable English word list and a reliable abbreviation list; the ambiguity between abbreviations and English/medical words was one of the bottlenecks of the baseline method.
We analyzed errors made by the abbreviation detection methods. Some false negatives were caused by multi-word abbreviations, such as “B. cepacia”, which the current method did not handle: these terms were broken into single tokens that the system then missed. Some shortened words, such as “approx”, were also missed; they are hard to detect because they resemble ordinary English words. Ambiguous abbreviations were another source of errors, such as “mom”, which, as mentioned earlier, can be both an English word and an abbreviation. Correctly detecting such ambiguous abbreviations requires sophisticated disambiguation methods, which are beyond the scope of this study.
Detecting unknown abbreviations is a more challenging task and is our next focus. Although RF showed good performance on detecting all abbreviations, its performance dropped when unique and unknown abbreviations were analyzed. SVM performed better at detecting unique and unknown abbreviations, indicating that it could be a more robust ML algorithm for this task. Combining different classifiers yielded a good improvement for detecting unknown abbreviations (F-measure from 84.1% to 86.1%). We will further investigate more sophisticated combination methods for this purpose. It would also be interesting to analyze the unknown abbreviations and determine which are clinically important to recognize.
The performance of all classifiers improved on GS-2, which resolved discrepant samples among the three physicians. We further analyzed these discrepant samples and found that most of them were annotation errors; for example, physicians sometimes missed true abbreviations and did not annotate them. This indicates that manual annotation of abbreviations is not error-free, and that combining manual annotation with the output of ML-based detection programs could yield more accurate gold standard data sets.
The RF model trained on all 70 documents was used to predict abbreviations from the entire 10-year set of discharge summaries. After a simple normalization process, which converted abbreviations to lower case and removed periods, a total of 26,257 distinct abbreviations were detected, with 24,120,704 total occurrences. Among the 26,257 abbreviations, 9,774 occurred only once, accounting for only 0.04% of total abbreviation occurrences, whereas the 2,777 abbreviations that occurred more than 100 times accounted for 99.3% of total abbreviation occurrences. 19,115 unique abbreviations were not covered by the abbreviation list we generated from available knowledge bases (containing 54,627 unique abbreviations). Figure 2 shows the word frequency distribution over all predicted abbreviations, a typical distribution following Zipf’s law22 (a small sketch of the normalization and tally follows Figure 2).
Figure 2:
Distribution of abbreviations in the entire set of discharge summaries.
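A minimal sketch of the normalization and frequency tally described above; predicted_abbrs is a hypothetical stand-in for the abbreviation tokens predicted over the full corpus:

```python
from collections import Counter

def normalize(token):
    # lower-case and strip periods, as in the paper's normalization step
    return token.lower().replace(".", "")

predicted_abbrs = ["MI", "pt", "Pt.", "etoh", "MI"]  # toy corpus output
counts = Counter(normalize(t) for t in predicted_abbrs)

total = sum(counts.values())
singletons = sum(1 for c in counts.values() if c == 1)
frequent_share = sum(c for c in counts.values() if c > 100) / total
print(len(counts), singletons, frequent_share)
```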
The method proposed in this paper can be used to develop a complete list of abbreviations occurring in clinical text, thus providing a valuable lexical resource for clinical NLP systems. In addition, it will be useful for computerized document entry systems that aim to provide real-time concept encoding: such a system could recognize an unknown abbreviation while a physician is typing it and raise appropriate flags for identification.
Conclusion
In this paper, we presented a corpus-based method to detect clinical abbreviations using machine learning. Evaluation on a set of manually annotated abbreviations from hospital discharge summaries showed that it could detect abbreviations with an F-measure of 95.7%, indicating that such methods could be applied to clinical corpora to create a useful lexical resource of abbreviations.
Acknowledgments
This study was supported by NLM grant R01LM010681. The datasets used were obtained from Vanderbilt University Medical Center’s Synthetic Derivative, which is supported by institutional funding and by Vanderbilt CTSA grant 1UL1RR024975-01 from NCRR/NIH.
References
- 1. Meystre SM, Savova GK, Kipper-Schuler KC, Hurdle JF. Extracting information from textual documents in the electronic health record: a review of recent research. Yearb Med Inform. 2008:128–44.
- 2. Xu H, Stetson PD, Friedman C. A study of abbreviations in clinical notes. AMIA Annu Symp Proc. 2007:821–5.
- 3. Dawson KP, Capaldi N, Haydon M, Penna AC. The paediatric hospital medical record: a quality assessment. Aust Clin Rev. 1992;12(2):89–93.
- 4. Manzar S, Nair AK, Govind Pai M, Al-Khusaiby S. Use of abbreviations in daily progress notes. Arch Dis Child Fetal Neonatal Ed. 2004;89(4). doi: 10.1136/adc.2003.045591.
- 5. Sheppard JE, Weidner LC, Zakai S, Fountain-Polley S, Williams J. Ambiguous abbreviations: an audit of abbreviations in paediatric note keeping. Arch Dis Child. 2008;93(3):204–6. doi: 10.1136/adc.2007.128132.
- 6. Walsh KE, Gurwitz JH. Medical abbreviations: writing little and communicating less. Arch Dis Child. 2008;93(10):816–7. doi: 10.1136/adc.2008.141473.
- 7. Myers JS, Gojraty S, Yang W, Linsky A, Airan-Javia S, Polomano RC. A randomized-controlled trial of computerized alerts to reduce unapproved medication abbreviation use. J Am Med Inform Assoc. 2011;18(1):17–23. doi: 10.1136/jamia.2010.006130.
- 8. Berman JJ. Pathology abbreviated: a long review of short terms. Arch Pathol Lab Med. 2004;128(3):347–52. doi: 10.5858/2004-128-347-PAALRO.
- 9. Stetson PD, Johnson SB, Scotch M, Hripcsak G. The sublanguage of cross-coverage. Proc AMIA Symp. 2002:742–6.
- 10. Liu H, Lussier YA, Friedman C. A study of abbreviations in the UMLS. Proc AMIA Symp. 2001:393–7.
- 11. Zhou W, Torvik VI, Smalheiser NR. ADAM: another database of abbreviations in MEDLINE. Bioinformatics. 2006;22(22):2813–8. doi: 10.1093/bioinformatics/btl480.
- 12. Kuo CJ, Ling MH, Lin KT, Hsu CN. BioADI: a machine learning approach to identifying abbreviations and definitions in biological literature. BMC Bioinformatics. 2009;10(Suppl 15):S7. doi: 10.1186/1471-2105-10-S15-S7.
- 13. Xu H, Stetson PD, Friedman C. Methods for building sense inventories of abbreviations in clinical notes. J Am Med Inform Assoc. 2009;16(1):103–8. doi: 10.1197/jamia.M2927.
- 14. Toole J. A Hybrid Approach to the Identification and Expansion of Abbreviations. 2000.
- 15. Roden DM, Pulley JM, Basford MA, Bernard GR, Clayton EW, Balser JR, Masys DR. Development of a large-scale de-identified DNA biobank to enable personalized medicine. Clin Pharmacol Ther. 2008;84(3):362–9. doi: 10.1038/clpt.2008.89.
- 16. Witten I, Frank E. Data Mining: Practical Machine Learning Tools and Techniques. 2nd ed. 2005.
- 17. Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/cjlin/papers/libsvm.pdf.
- 18. GNU Aspell. http://aspell.net/.
- 19. Knuth’s list of American English words. http://rabbit.eng.miami.edu/dics/knuthus.txt.
- 20. Torii M, Hu Z, Wu CH, Liu H. BioTagger-GM: a gene/protein name recognition system. J Am Med Inform Assoc. 2009;16(2):247–55. doi: 10.1197/jamia.M2844.
- 21. Van Rijsbergen CJ. Information Retrieval. Butterworth-Heinemann; 1979.
- 22. Zipf GK. Human Behavior and the Principle of Least Effort. 1949.