Abstract
Words that have different representations but are semantically related, such as dementia and delirium, can pose difficulties in understanding text. We explore the use of interaction frequency data between semantic elements as a means to differentiate concept pairs, using semantic predications extracted from the biomedical literature. We applied datasets of features drawn from semantic predications for semantically related pairs to two Expectation Maximization clustering processes (first without, then with concept labels), then used all data to train and evaluate several concept-classifying algorithms. Five of six unlabeled datasets (83%) displayed the expected cluster count and similar or matching proportions; all labeled data exhibited similar or matching proportions when cluster count was restricted to the number of unique labels. The highest-performing classifier achieved 89% accuracy, with F1 scores for individual concept classification ranging from 0.69 to 1. We conclude with a discussion of how these findings may be applied to natural language processing of clinical text.
Introduction
Word Sense Disambiguation (WSD), a common task in Natural Language Processing (NLP), is the process of determining the precise meaning of an ambiguous word for a given instance in text. For example, the word “cold” in biomedical text can refer either to rhinovirus (e.g., “the common cold affects many people”) or cold temperature (e.g., “the patient felt cold in the room”). In this task, we have a single representation, “cold”, that can possess one of multiple, distinct definitions, depending on the use of the word. WSD is usually achieved through analyzing the context in which the word appears1. Recent popular biomedical WSD applications include disambiguation of terms2,3 as well as abbreviations and acronyms4.
Another difficult task, similar to WSD, is to differentiate two terms that have different representations but similar meanings. For example, dementia and delirium are similar in manifestation, but they are separate, independent conditions. Delirium, or Acute Changes in Mental Status, is defined as a fluctuating disturbance of cognition (memory, language, orientation) and/or consciousness with reduced ability to focus, sustain, or shift attention5. Delirium may impact 14–56% of all hospitalized elderly patients6,7 and is associated with poor outcomes, including increased length of stay, increased likelihood of falls and accidents, and discharge to a nursing home. Delirium poses diagnostic dilemmas for physicians, who often must discriminate delirium from dementia in elderly patients, and, due to the need for increased monitoring, it is often a significant burden on nursing staff8. Dementia and delirium are different clinical events, and this has important implications for treatment and other aspects of case management. However, even provider representations of these conditions are inconsistent. Researchers found that clinicians inconsistently document delirium in veteran electronic health records, even when there is a confirmed diagnosis9. Additionally, they use alternative terms such as “disoriented”, “muttering”, and “showing confusion” in documenting delirium10. Inconsistent documentation in general causes clinical care team members to expend extra effort to validate data11. Prolonged delirium in care settings poses significant risks of cognitive impairment and death12. Therefore, it is vital to identify and treat it quickly. However, it is often not recognized, or is misdiagnosed as dementia or another psychiatric condition13. Dementia recognition is likewise important, yet early-onset dementia is often misdiagnosed, which can have devastating consequences for patients and their families14.
Data that capture how semantic elements interact in text may help explicate the differences between two separate terms that have similar meanings. This may be especially useful in distinguishing incidents of dementia or delirium in text. This study explores the application of such data to clustering and classifying exercises, as an initial effort toward using it to differentiate two related terms and to support other NLP tasks.
Background
Semantic predications provide a representation of biomedical text that has been distilled to simple assertions, consisting of the semantic elements expressed in the original text. SemRep15, an NLP application, extracts semantic predications from PubMed title and abstract text. For example, SemRep takes the following text:
“…calcium channel blockers…reduce the risk of dementia…”16
and extracts this semantic predication:
Calcium Channel Blockers (Pharmaceutical Substance) PREVENTS Dementia (Disease or Syndrome)
SemRep identifies the two prominent arguments in the text and their preferred terms from the UMLS Metathesaurus (C0006684 Calcium Channel Blockers and C0497327 Dementia), as well as the predicate, or relationship that binds them (PREVENTS), as indicated in the UMLS Semantic Network. SemRep also identifies the semantic types, or subject classes, of the two arguments, indicated in parentheses. To better define this semantic predication topologically, one could refer to Calcium Channel Blockers as the opposite argument to Dementia, and vice versa.
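For illustration, the elements of such a predication can be represented as a simple record. The sketch below is in R, the language used for the analyses later in this paper; the field names are our own illustration, not SemRep’s actual output format:

```r
# Illustrative representation of the example predication; field names
# are hypothetical and do not reflect SemRep's actual output schema.
predication <- data.frame(
  subject_name = "Calcium Channel Blockers",
  subject_cui  = "C0006684",
  subject_type = "Pharmaceutical Substance",
  predicate    = "PREVENTS",
  object_name  = "Dementia",
  object_cui   = "C0497327",
  object_type  = "Disease or Syndrome",
  stringsAsFactors = FALSE
)
```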
The Semantic MEDLINE database17, a resource that provides semantic predications for research and other purposes, contains over 84 million semantic predications extracted by SemRep from biomedical literature published from 1900 to 2015.
These data structures may shed light on differentiating delirium and dementia; however, a basic analysis demonstrates substantial contextual similarities between the two terms. Semantic predications containing either term share many co-occurring elements. In semantic predications containing these terms, extracted from literature published from 2005 to 2014, Delirium, the smaller of the two in terms of total unique semantic elements with which it occurs, shares with Dementia all of its predicates, 54% of its opposite arguments, and 98% of its opposite arguments’ semantic types.
An analytical method such as clustering may provide differentiation between delirium and dementia as they appear in semantic predications. Expectation Maximization (EM) clustering18 was designed to identify latent, or implicit, variables in mixture models of shared probability density functions. Concept data drawn from the Semantic MEDLINE database can be characterized as representing a mixture model, where each concept belongs to subpopulations in which it interacts with other semantic elements. To run an instance of the EM algorithm, one must prescribe a number of clusters presumed to be representative of the data. Criteria such as the Bayesian Information Criterion (BIC) determine an optimal cluster count, which then enables the EM clustering process.
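For reference, in one common Gaussian formulation of this premise, each observation $x$ is assumed to be drawn from a mixture of $G$ component densities, and BIC trades model fit against complexity:

$$ f(x) = \sum_{k=1}^{G} \pi_k \, \phi(x \mid \mu_k, \Sigma_k), \qquad \sum_{k=1}^{G} \pi_k = 1 $$

$$ \mathrm{BIC} = 2 \ln \hat{L} - d \ln n $$

where $\pi_k$ are the mixing proportions, $\phi$ is the multivariate normal density with mean $\mu_k$ and covariance $\Sigma_k$, $\hat{L}$ is the maximized likelihood, $d$ is the number of free parameters, and $n$ is the number of observations; the cluster count and covariance structure that maximize BIC are selected.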
Frequency data and their membership in mixture models have previously been exploited in classification19. EM clustering has previously been applied in a semantic clustering context20, including efforts to build sense inventories for abbreviations in clinical notes21. This work applies EM clustering and other machine learning techniques to data extracted from the Semantic MEDLINE database.
Objectives
The purpose of this work is to determine whether (a) interaction rate data follow the clustering behavior expected of data drawn from a mixture model, and (b) these data can successfully be used to classify concept sense. Applying interaction values, i.e., representations of how each concept interacts with other semantic elements, to a clustering method might identify clusters of expected proportions. These values might also be used to train an effective classifier to identify instances of several concepts. Specifically, we wanted to answer the following questions:
Do unlabeled datasets of semantically related concept pairs exhibit two distinct clusters of equal or similar proportions when subjected to BIC modeling and EM clustering?
When labels are included, how do data instances cluster?
Could a dataset for several concepts, where there is matching data for another semantically related concept for each concept in the dataset, be applied to training several different types of classifiers? If so, which classifier would perform best?
To explore these questions, we analyzed data from several semantically related concept pairs including delirium and dementia. We first subjected datasets of concept pairs to two clustering processes, the first without the concept labels, then the second with the labels restored. We then applied the data for all concepts to training and testing several classifier algorithms.
Methods
Data Procurement and Preparation
We gathered the following features for each term for a ten-year period (2005-2014) from the Semantic MEDLINE database: frequency of the term appearing as an argument in the database, count of unique terms appearing in the opposite argument position, count of unique semantic types of the terms in the opposite argument position, count of unique predicates, count of unique semantic type-predicate combinations, and count of unique opposite argument-predicate combinations. We retrieved this data according to date of entry (EDAT) into the PubMed database. We aggregated it in one-month intervals to form vectors for these six features in order to provide many instances for the clustering and classifier training tasks. The following example is an instance of this data for Dementia. The features are concept label, concept occurrence frequency, unique opposite argument frequency, unique predicate frequency, unique semantic type frequency of the opposite argument, unique semantic type-predicate combinations, and unique opposite argument-predicate combinations:
Dementia, 160, 39, 12, 19, 29, 45
In other words, for this given month, there were 160 instances of the concept “Dementia”, 39 unique concepts appearing with it in the opposite argument position, 12 unique predicates, 19 unique semantic types represented among the opposite argument concepts, 29 unique semantic type-predicate combinations, and 45 unique opposite argument-predicate combinations. We gathered this data for the following concept pairs (Table 1):
Table 1.
Concept Pairs
| Concept 1 Name | Concept 1 CUI | Concept 2 Name | Concept 2 CUI |
|---|---|---|---|
| Delirium | C0011206 | Dementia | C0497327 |
| Heart | C0018787 | Myocardium | C0027061 |
| Abortion | C0156543 | Miscarriage | C0000786 |
| Congestive Heart Failure | C0018802 | Pulmonary edema | C0034063 |
| Stroke | C0038454 | Infarct | C0021308 |
| Delusion | C0011253 | Schizophrenia | C0036341 |
The additional pairs were rated as significantly semantically related by three clinicians in a previous study22.
All but one concept yielded 120 instances. Abortion (C0156543) yielded 116 instances; in other words, four months of the 10-year collection period contained insufficient data to produce instances for this study. We did not use any imputation methods to create data for the four missing instances in this dataset.
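A minimal sketch of the monthly feature aggregation described above, assuming a data frame `rows` holding one record per predication occurrence of a concept; the column names are illustrative and do not reflect the actual Semantic MEDLINE schema:

```r
# Hypothetical input: one row per predication occurrence of the concept,
# with columns edat_month (year-month of PubMed entry), opposite_argument,
# predicate, and opposite_semtype. Column names are illustrative only.
library(dplyr)

instances <- rows %>%
  group_by(edat_month) %>%
  summarise(
    frequency      = n(),                                            # concept occurrence count
    uniq_args      = n_distinct(opposite_argument),                  # unique opposite arguments
    uniq_preds     = n_distinct(predicate),                          # unique predicates
    uniq_semtypes  = n_distinct(opposite_semtype),                   # unique opposite-argument semantic types
    uniq_type_pred = n_distinct(paste(opposite_semtype, predicate)), # semantic type-predicate combinations
    uniq_arg_pred  = n_distinct(paste(opposite_argument, predicate)) # opposite argument-predicate combinations
  )
```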
After retrieving the data, we normalized the values of each quantitative feature for each concept by scaling to a mean of 0 and a standard deviation of 1. For each concept dataset, each quantitative feature was represented in a single vector, forming a total of six vectors with 120 values each per dataset, except for the Abortion dataset, which had six vectors with 116 values each. The corresponding concept label formed an additional feature vector in each dataset; these labels were excluded for the first clustering exercise (described in the following section). For classifier training, we also randomized the instances and split the dataset into a training subset (80%) and a testing subset (20%), as sketched below.
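A minimal sketch of these preparation steps, assuming `data` is a data frame whose first column holds the concept label and whose remaining six columns hold the quantitative features (variable names and the seed value are our own illustration):

```r
# Scale each quantitative feature to mean 0 and standard deviation 1,
# leaving the concept label (first column) untouched.
data[, -1] <- scale(data[, -1])

# Randomize the instances and split 80% / 20% into training and testing
# subsets. The seed is arbitrary, chosen only for reproducibility.
set.seed(42)
train_idx <- sample(nrow(data), size = round(0.8 * nrow(data)))
train     <- data[train_idx, ]
test      <- data[-train_idx, ]
```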
Clustering Analysis
We applied a model-based clustering approach suggested by Fraley23 using the MCLUST software package24 in R25. This technique implements BIC in combination with EM clustering and hierarchical agglomeration to find and execute the best model (including cluster count and characteristics) for a given dataset, under the premise that the data in question are drawn from a mixture of multiple probability distributions. In addition to cluster count, output also includes cluster characteristics such as volume, shape, and orientation, according to the best model. We also documented the covariance matrix model implemented in determining the best BIC model (and cluster count). We analyzed each dataset for each concept pair, removing the concept labels from the instances before processing.
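A minimal sketch of this unlabeled clustering step, assuming `features` holds the six scaled quantitative feature columns with the concept label removed; Mclust() searches over covariance models and cluster counts and selects the combination that maximizes BIC:

```r
library(mclust)

# Fit Gaussian mixture models over a range of cluster counts and
# covariance structures; the best model is chosen by BIC.
fit <- Mclust(features)

summary(fit)               # best covariance model (e.g., "VVV") and cluster count
fit$G                      # selected number of clusters
table(fit$classification)  # cluster sizes, i.e., the cluster ratios in Table 2
plot(fit, what = "classification")  # two-dimensional cluster plots, as in Figure 1
```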
The software also includes functionality to cluster labeled data for cluster classification by applying Eigenvalue Decomposition Discriminant Analysis (EDDA). We applied this process to the concept pair data with the concept class labels included. This exercise allowed us to view how the concept pair data instances divided into categories when matched against possible classifications.
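A minimal sketch of the labeled exercise under the same assumptions, with `labels` holding the concept labels; in the mclust package, modelType = "EDDA" fits Eigenvalue Decomposition Discriminant Analysis, constraining every class to the same covariance model:

```r
# EDDA: one mixture component per class, all classes sharing a single
# covariance model.
da <- MclustDA(features, class = labels, modelType = "EDDA")

summary(da)                         # per-class models and classification table
pred <- predict(da, features)
table(labels, pred$classification)  # how labeled instances divide into classes
```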
Classifier Training
Using the data for all 12 concepts, we trained several types of classifiers, specifically a Support Vector Machine (SVM), a Random Forest (a tree-based ensemble), a Neural Network, and a Naïve Bayes classifier, because these are identified as commonly used in the related task of WSD26. We employed the following R packages for the four classifier types, to create basic versions of their resulting products:
Support Vector Machine: libSVM (in the e1071 package)27
Random Forest: randomForest28
Neural Network: nnet29
Naïve Bayes: e107127
This was a simple, exploratory exercise, to analyze the results from a variety of classifiers. In addition to the training data, we added the following parameters to each classifier instance (a sketch implementing them follows the list):
SVM: A cost value of 100; Gamma value of 1.
Random Forest: Importance of predictors assessed; 2000 trees produced.
Neural Network: Five neurons in a single hidden layer; initial weight value of 0.1; weight decay of 0.00005; 200 maximum iterations.
Naïve Bayes: (all application default settings implemented)
Otherwise, the basic default parameters provided by each application were used.
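A minimal sketch of how the four classifiers might be instantiated with these parameters, assuming `train` is the training subset with the concept label stored as a factor in a column named concept (our name, for illustration); we read the stated neural network “initial weight value” as nnet’s rang parameter:

```r
library(e1071)         # svm() and naiveBayes()
library(randomForest)  # randomForest()
library(nnet)          # nnet()

# Support Vector Machine: cost of 100, gamma of 1.
svm_fit <- svm(concept ~ ., data = train, cost = 100, gamma = 1)

# Random Forest: assess predictor importance, grow 2000 trees.
rf_fit <- randomForest(concept ~ ., data = train,
                       importance = TRUE, ntree = 2000)

# Neural Network: five neurons in the single hidden layer, initial
# weights in [-0.1, 0.1] (our reading), weight decay of 0.00005,
# 200 maximum iterations.
nn_fit <- nnet(concept ~ ., data = train,
               size = 5, rang = 0.1, decay = 5e-5, maxit = 200)

# Naïve Bayes: all package defaults.
nb_fit <- naiveBayes(concept ~ ., data = train)
```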
Results
Clustering Analysis
Of the six concept pairs, all but one exhibited two clusters of similar or equal proportions in terms of data elements (Table 2) when clustered without their concept labels. Three covariance models were identified across the datasets, along with individual cluster characteristics. All clusters were ellipsoidal, but some otherwise exhibited variation.
Table 2.
Clustering results.
| Dataset | Cluster Count | Cluster ratios | Covariance Model | Cluster Characteristics |
|---|---|---|---|---|
| Abortion - Miscarriage | 5 | 25/70/64/36/41 | VEV | ellipsoidal, equal shape |
| Congestive Heart Failure – Pulmonary Edema | 2 | 124/116 | VVV | ellipsoidal, varying volume, shape, and orientation |
| Delusion – Schizophrenia | 2 | 120/120 | VVV | ellipsoidal, varying volume, shape, and orientation |
| Dementia – Delirium | 2 | 113/127 | VVV | ellipsoidal, varying volume, shape, and orientation |
| Heart - Myocardium | 2 | 120/120 | VVE | ellipsoidal, equal orientation |
| Stroke - Infarct | 2 | 121/119 | VEV | ellipsoidal, equal shape |
Two-dimensional cluster plot images of concept instances and predicates further illustrate cluster count and characteristics (Figure 1):
Figure 1.
Cluster plots of concept instances and predicates
For the five pairs exhibiting two clusters, the cluster ratios matched or closely followed the 50/50 ratio of data instances (i.e., 120 instances for each concept), with little overlap where the ratio deviated from an evenly divided split. The best model for the Abortion-Miscarriage dataset contained five clusters of varying proportions.
When we restored the labels to the data and subjected it to an additional predictive process implementing Eigenvalue Decomposition Discriminant Analysis30, in which each known class is modeled with the same covariance structure, the results changed. The same covariance models resulting from the unlabeled clustering exercise were applied. The purpose of this was to observe which labels were associated with which cluster in each pair when matched against possible classifications, although other outcomes from this modified process are also of interest. Abortion and Miscarriage were now represented by two clusters (because there are two classes represented in the data), and there was slightly reduced overlap for two of the concept pairs (see Figure 2).
Figure 2.
Results of clustering with labels included
Classifier Training
While nnet provides a basic, popular package for neural networks, it does not support multiple hidden layers. Experimentation revealed that five neurons in the single hidden layer produced the best results. Additional limited experimentation enabled basic tuning of the other types of classifiers using the parameters already noted.
Each classifier yielded a different composite accuracy score, measured as the percentage of classifications in the testing data correctly predicted (Table 3). All classifiers except Naïve Bayes exceeded 80% accuracy.
Table 3.
Classifier Accuracy
| Classifier | Accuracy |
|---|---|
| SVM | 0.89 |
| Random Forest | 0.83 |
| Neural Network | 0.85 |
| Naïve Bayes | 0.72 |
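For reference, a minimal sketch of how the accuracy above and the per-concept scores reported below can be computed from a confusion matrix, using the trained SVM from the earlier sketch and the held-out `test` subset (variable names are our own):

```r
# Composite accuracy: proportion of correctly predicted test instances.
pred <- predict(svm_fit, newdata = test)
cm   <- table(predicted = pred, actual = test$concept)
accuracy <- sum(diag(cm)) / sum(cm)

# Per-concept precision, recall, and F1 (one-vs-rest).
precision <- diag(cm) / rowSums(cm)  # correct / all predicted as the concept
recall    <- diag(cm) / colSums(cm)  # correct / all actual instances of the concept
f1        <- 2 * precision * recall / (precision + recall)
```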
Precision, recall, and F1 scores for classifying individual concepts are indicated in Table 4.
Table 4.
Precision (P), recall (R), and F1 scores from classifier performance for individual concepts. RF = Random Forest; NN = Neural Network; NB = Naïve Bayes.

| Concept | SVM P | SVM R | SVM F1 | RF P | RF R | RF F1 | NN P | NN R | NN F1 | NB P | NB R | NB F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Miscarriage | 0.90 | 0.70 | 0.79 | 0.84 | 0.57 | 0.68 | 0.79 | 0.62 | 0.69 | 0.62 | 0.41 | 0.49 |
| Delirium | 0.83 | 0.81 | 0.82 | 0.73 | 0.71 | 0.72 | 0.69 | 0.71 | 0.70 | 0.65 | 0.35 | 0.45 |
| CHF | 0.89 | 0.74 | 0.81 | 0.59 | 0.74 | 0.66 | 0.72 | 0.57 | 0.64 | 0.52 | 0.57 | 0.54 |
| Delusion | 0.74 | 0.82 | 0.78 | 0.67 | 0.71 | 0.69 | 0.74 | 0.82 | 0.78 | 0.56 | 0.64 | 0.60 |
| Pulmonary Edema | 0.63 | 0.77 | 0.69 | 0.63 | 0.77 | 0.69 | 0.68 | 0.86 | 0.76 | 0.40 | 0.73 | 0.52 |
| Abortion | 0.90 | 0.93 | 0.91 | 0.87 | 0.90 | 0.88 | 0.93 | 0.86 | 0.89 | 0.77 | 0.79 | 0.78 |
| Infarct | 0.92 | 1 | 0.96 | 0.92 | 0.92 | 0.92 | 0.88 | 0.96 | 0.92 | 0.81 | 0.92 | 0.86 |
| Myocardium | 0.97 | 0.97 | 0.97 | 0.93 | 0.90 | 0.91 | 0.91 | 0.97 | 0.94 | 0.93 | 0.81 | 0.87 |
| Dementia | 0.91 | 1 | 0.95 | 0.91 | 0.94 | 0.92 | 0.94 | 0.97 | 0.95 | 0.83 | 0.81 | 0.82 |
| Schizophrenia | 1 | 0.95 | 0.97 | 1 | 0.89 | 0.94 | 0.95 | 1 | 0.97 | 0.75 | 0.95 | 0.84 |
| Stroke | 1 | 1 | 1 | 0.96 | 1 | 0.98 | 1 | 0.96 | 0.98 | 0.95 | 0.91 | 0.93 |
| Heart | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Discussion
This exercise demonstrates that people communicate differently about similar topics, even if (as in the case of Dementia and Delirium) that communication shares a substantial portion of contextual semantic elements. That difference can be detected by analyzing data that capture how semantic elements interact.
In the first clustering exercise, five of the six concept pair datasets (83%) gave an affirmative answer to the study’s first question, “Do unlabeled datasets of semantically related concept pairs exhibit two distinct clusters of equal or similar proportions when subjected to BIC modeling and EM clustering?”: each yielded two distinct clusters of proportions matching or similar to the count of instances for each concept. The dataset not exhibiting this behavior was the Abortion-Miscarriage pair. As mentioned earlier, the Abortion data contained 116 instances (as compared to 120 instances for all other concepts). The effect of this on the results is unknown, although the difference in instance counts is quite small. When restricted to two clusters (there being two classes) and modeled with the same covariance matrix, the Abortion-Miscarriage data behaved similarly to the other pairs.
Overall, the classifier training and testing exercise provided an affirmative answer to the question, “Could a dataset for several concepts, where there is matching data for another semantically related concept for each concept in the dataset, be applied to training several different types of classifiers?”. Training multiple types of classifiers provides a baseline expectation of results for similar exercises using this type of data. The Support Vector Machine yielded the best overall accuracy. With the exception of Pulmonary Edema, the F1 scores produced by the SVM for concept classification matched or exceeded those of the other types of classifiers, including for delirium (0.82) and dementia (0.95). The Naïve Bayes classifier produced the lowest overall accuracy and F1 scores.
We compared these results to those of a similar task, WSD, where the researchers applied a new Expectation Maximization-based algorithm to determine the exact sense of the term “cold”3. Their top-performing model achieved 0.89 accuracy, which was comparable to the SVM we trained and to most of the other state-of-the-art systems described in their work (the highest being 0.93, achieved by a Naïve Bayes application). In our similar task, we differentiated the senses of several terms with the SVM, achieving F1 scores ranging from 0.69 to 1, with an average score of 0.89.
Implications for Future Work
Understanding differences in how individuals communicate about dementia and delirium may shed further light on distinguishing the two terms in text. To understand the results further, we reviewed instances of feature data for the 10-year period and found more variety in the predicate-opposite argument combinations for delirium than for dementia. Combinations that individually occur fewer than ten times comprise 37.7% of the delirium data, as compared to 18% for dementia. This means that, in the biomedical literature, authors communicate about delirium in more varied ways than about dementia, in terms of combinations of predicates and individual arguments. For predicate-semantic type combinations, there was much less variation: combinations occurring fewer than ten times accounted for 1.4% and 4.5% of dementia’s and delirium’s data, respectively, suggesting that authors are more likely to communicate in established predicate-semantic type patterns. These types of communication differences likely affected data behavior in the clustering and classifying exercises. A sketch of this review follows.
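A minimal sketch of this review, assuming `preds` holds one row per predication instance for a concept over the 10-year period, with illustrative column names predicate and opposite_argument:

```r
# Share of a concept's predication instances accounted for by
# predicate-opposite argument combinations occurring fewer than ten times.
combo      <- paste(preds$predicate, preds$opposite_argument)
counts     <- table(combo)
rare_share <- sum(counts[counts < 10]) / sum(counts)  # e.g., 0.377 for delirium, 0.18 for dementia
```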
The data were drawn from the biomedical literature. Would similar data from clinical text yield the same results? There have been efforts to extract relational data from clinical text31,32. A proficient software application that extracts semantic predications from text may assist in answering this question. We are currently developing an application that extracts semantic predications from clinical text. Clinical text containing definitive instances of events such as delirium and dementia, i.e., text with a high certainty of such events, could be used to extract the initial semantic predications and build a database of these data structures drawn from clinical text.
A variation of sense differentiation in clinical text could address content where the author is uncertain of the event that is taking place, or where the resulting clinical text suggests this. For example, if a clinical note implied uncertainty regarding whether a patient had dementia or delirium, perhaps the condition could be determined by the way the author recorded it. Semantic predications extracted from clinical text where there is high certainty of the events as recorded (as noted before) could be used to train an SVM classifier, which in turn could be used to analyze data drawn from clinical text exhibiting less certainty. For the semantic predications extracted from the latter text, labels addressing the concepts of interest would carry a generic indication, such as “Condition X”. For example, instances of dementia and delirium, where the certainty of either is low, could be relabeled in this manner. Such work could reinforce an extended viewpoint on how word sense is identified, by taking into consideration the certainty with which a diagnosis or other assertion is recorded. It could complement, and possibly incorporate, research addressing the subjective properties of assertions in text33.
A specific application could assess an author’s certainty in documenting diagnoses and other narratives addressing biomedical phenomena. For example, an application could alert a clinician discussing a diagnosis of Disease X that he or she is communicating about it in a manner consistent with Disease Y. This could be especially useful for diagnostically problematic conditions such as delirium. Application performance could be determined by processing relevant text, and interviewing the author to assess certainty in the recorded phenomenon.
In this study, we used semantic interaction rate data drawn from semantic predications to explicate the different senses of several semantically related pairs. There may be other NLP clinical text applications for which this type of data could be useful. It could drive temporal information retrieval34 by identifying documents with a high concentration of a given concept and frequent interactions with other semantic elements within a timeframe of interest. It could also be implemented in semi-supervised learning tasks requiring temporal concept data, such as certain approaches to coreference resolution35.
Limitations
The classifier exercise utilized a dataset limited to 12 concepts. However, it was intended as an exploratory exercise using data for semantically related concepts, which this dataset accomplished.
This is an elementary exploration of analyzing interactions among semantic elements in text. We used a simple form of frequency data to capture semantic element interaction characteristics. It included temporal counts, but not other artifacts, such as the most common predicate (e.g., PREVENTS), the most common opposite argument (e.g., calcium channel blockers), or other elements for the given time period. Different expressions of semantic element interaction data may provide deeper insight into this phenomenon in text.
Factoring in SemRep’s performance in accurately identifying semantic relationships in the original text is outside the scope of this paper, but would be an interesting subject for a future study.
Conclusion
We applied interaction rate data extracted from semantic predications to clustering and classification exercises to determine whether these data display expected clustering behavior and can be used to classify concept instances, given a dataset containing instances of several semantically related concept pairs. Clustering of unlabeled data demonstrated two clusters of equal or similar proportions for five of the six concept pairs. When labels were included, all concept pairs demonstrated two clusters of equal or similar proportions. Four different classifiers were trained and tested. Accuracy across the classifiers ranged from 0.72 to 0.89, with a Support Vector Machine achieving the highest accuracy. The outcomes of these exercises pose interesting questions and suggest potential applications in sense differentiation and other NLP tasks.
Acknowledgements
We wish to thank Guy Divita for his contributions to this paper. This work was funded by the U.S. Department of Veterans Affairs, Health Services Research & Development (HSR&D), Project ID: CRE 12-321. This work was also supported in part by the intramural research program at the U.S. National Library of Medicine, National Institutes of Health.
References
- 1. Ide N, Véronis J. Introduction to the special issue on word sense disambiguation: the state of the art. Computational Linguistics. 1998;24(1):2–40.
- 2. Abed SA, Tiun S, Omar N. Harmony search algorithm for word sense disambiguation. PLoS One. 2015;10(9):e0136614. doi: 10.1371/journal.pone.0136614.
- 3. Jimeno Yepes A, Berlanga R. Knowledge based word-concept model estimation and refinement for biomedical text mining. J Biomed Inform. 2015;53:300–7. doi: 10.1016/j.jbi.2014.11.015.
- 4. Moon S, McInnes B, Melton GB. Challenges and practical approaches with word sense disambiguation of acronyms and abbreviations in the clinical domain. Healthc Inform Res. 2015;21(1):35–42. doi: 10.4258/hir.2015.21.1.35.
- 5. American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 5th ed. Arlington, VA: American Psychiatric Publishing; 2013.
- 6. Inouye SK. Delirium in hospitalized older patients: recognition and risk factors. J Geriatr Psychiatry Neurol. 1998;11(3):118–25; discussion 157–8. doi: 10.1177/089198879801100302.
- 7. Inouye SK. Delirium in older persons. N Engl J Med. 2006;354(11):1157–65. doi: 10.1056/NEJMra052321.
- 8. Carr FM. The role of sitters in delirium: an update. Can Geriatr J. 2013;16(1):22–36. doi: 10.5770/cgj.16.29.
- 9. Hope C, Estrada N, Weir C, Teng CC, Damal K, Sauer BC. Documentation of delirium in the VA electronic health record. BMC Res Notes. 2014;7:208. doi: 10.1186/1756-0500-7-208.
- 10. Spuhl J, Doing-Harris K, Nelson S, Estrada N, Del Fiol G, Weir C. Concordance of Electronic Health Record (EHR) data describing delirium at a VA hospital. AMIA Annu Symp Proc. 2014;2014:1066–71.
- 11. Keenan G, Yakel E, Dunn Lopez K, Tschannen D, Ford YB. Challenges to nurses’ efforts of retrieving, documenting, and communicating patient care information. J Am Med Inform Assoc. 2013;20(2):245–51. doi: 10.1136/amiajnl-2012-000894.
- 12. Volland J, Fisher A, Drexler D. Delirium and dementia in the intensive care unit: increasing awareness for decreasing risk, improving outcomes, and family engagement. Dimens Crit Care Nurs. 2015;34(5):259–64. doi: 10.1097/DCC.0000000000000133.
- 13. Rigney TS. Delirium in the hospitalized elder and recommendations for practice. Geriatr Nurs. 2006;27(3):151–7. doi: 10.1016/j.gerinurse.2006.03.014.
- 14. Mendez MF. The accurate diagnosis of early-onset dementia. Int J Psychiatry Med. 2006;36(4):401–12. doi: 10.2190/Q6J4-R143-P630-KW41.
- 15. Rindflesch TC, Fiszman M. The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text. J Biomed Inform. 2003;36(6):462–77. doi: 10.1016/j.jbi.2003.11.003.
- 16. Saravanaraman P, Chinnadurai RK, Boopathy R. Why calcium channel blockers could be an elite choice in the treatment of Alzheimer’s disease: a comprehensive review of evidences. Rev Neurosci. 2014;25(2):231–46. doi: 10.1515/revneuro-2013-0056.
- 17. Kilicoglu H, Shin D, Fiszman M, Rosemblat G, Rindflesch TC. SemMedDB: a PubMed-scale repository of biomedical semantic predications. Bioinformatics. 2012;28(23):3158–60. doi: 10.1093/bioinformatics/bts591.
- 18. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological). 1977;39(1):1–38.
- 19. Bouguila N. Count data modeling and classification using finite mixtures of distributions. IEEE Transactions on Neural Networks. 2011;22(2):186–98. doi: 10.1109/TNN.2010.2091428.
- 20. Lippincott T, Passonneau R. Semantic clustering for a functional text classification task. In: Computational Linguistics and Intelligent Text Processing. Springer; 2009. p. 509–22.
- 21. Xu H, Stetson PD, Friedman C. Methods for building sense inventories of abbreviations in clinical notes. J Am Med Inform Assoc. 2009;16(1):103–8. doi: 10.1197/jamia.M2927.
- 22. Pedersen T, Pakhomov SV, Patwardhan S, Chute CG. Measures of semantic similarity and relatedness in the biomedical domain. J Biomed Inform. 2007;40(3):288–99. doi: 10.1016/j.jbi.2006.06.004.
- 23. Fraley C, Raftery AE. How many clusters? Which clustering method? Answers via model-based cluster analysis. The Computer Journal. 1998;41(8):578–88.
- 24. Fraley C, Raftery AE, Scrucca L, Murphy TB, Fop M. Package mclust; 2015.
- 25. Ihaka R, Gentleman R. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics. 1996;5(3):299–314.
- 26. Navigli R. Word sense disambiguation: a survey. ACM Computing Surveys (CSUR). 2009;41(2).
- 27. Meyer D, Dimitriadou E, Hornik K, Weingessel A, Leisch F, Chang C-C, et al. Package e1071; 2015.
- 28. Breiman L, Cutler A, Liaw A, Wiener M. Package randomForest; 2015.
- 29. Ripley B, Venables W. Package nnet; 2015.
- 30. Bensmail H, Celeux G. Regularized Gaussian discriminant analysis through eigenvalue decomposition. Journal of the American Statistical Association. 1996;91(436):1743–8.
- 31. Rink B, Harabagiu S, Roberts K. Automatic extraction of relations between medical concepts in clinical texts. J Am Med Inform Assoc. 2011;18(5):594–600. doi: 10.1136/amiajnl-2011-000153.
- 32. Minard AL, Ligozat AL, Ben Abacha A, Bernhard D, Cartoni B, Deleger L, et al. Hybrid methods for improving information access in clinical documents: concept, assertion, and relation identification. J Am Med Inform Assoc. 2011;18(5):588–93. doi: 10.1136/amiajnl-2011-000154.
- 33. Kilicoglu H, Bergler S. Recognizing speculative language in biomedical research articles: a linguistically motivated perspective. BMC Bioinformatics. 2008;9(Suppl 11):S10. doi: 10.1186/1471-2105-9-S11-S10.
- 34. Alonso O, Strötgen J, Baeza-Yates RA, Gertz M. Temporal information retrieval: challenges and opportunities. TWAW. 2011;11:1–8.
- 35. Raghavan P, Fosler-Lussier E, Lai AM. Exploring semi-supervised coreference resolution of medical concepts using semantic and temporal features. In: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics; 2012.


