Abstract
Objective:
Word embeddings project semantically similar terms into nearby points in a vector space. When trained on clinical text, these embeddings can be leveraged to improve keyword search and text highlighting. In this paper, we present methods to refine the selection process of similar terms from multiple EMR-based word embeddings, and evaluate their performance quantitatively and qualitatively across multiple chart review tasks.
Materials and Methods:
Word embeddings were trained on each clinical note type in an EMR. These embeddings were then combined, weighted, and truncated to select a refined set of similar terms to be used in keyword search and text highlighting. To evaluate their quality, we measured the similar terms’ information retrieval (IR) performance using precision-at-K (P@5, P@10). Additionally, a user study evaluated users’ search term preferences, while a timing study measured the time to answer a question from a clinical chart.
Results:
The refined terms outperformed the baseline method’s information retrieval performance (e.g., increasing the average P@5 from 0.48 to 0.60). Additionally, the refined terms were preferred by most users and reduced the average time to answer a question.
Conclusions:
Clinical information can be more quickly retrieved and synthesized when using semantically similar terms from multiple embeddings.
Keywords: electronic medical records (EMR), search engines, query expansion, highlighting, clinical similar terms, semantic embeddings
BACKGROUND AND SIGNIFICANCE
Electronic Medical Records (EMRs) [1–3] contain detailed, unstructured text describing medical conditions. As the size and complexity of EMR systems grow [4,5], tools are needed to help clinicians and researchers efficiently find relevant information within clinical notes. Two popular methods for clinical information retrieval and consumption are search engines and text highlighting.
Search engines have been widely used to help users retrieve information from medical charts [6–11]. Some EMR search engines apply the query expansion method [12–14], which takes the original search term, expands it into multiple terms, and returns documents containing any of the expanded terms. Similarly, text-highlighting systems highlight text within a note that contains a search term or similar terms to quickly focus the reviewer on important text [6–11].
Underpinning both tools is the need to extract similar terms for a given keyword. Two popular ways to produce clinical similar terms are (i) ontologies, such as SNOMED CT [15] and UMLS [16], and (ii) EMR-based semantic embeddings. While clinical ontologies are hard to construct and update, EMR-based semantic embeddings are trained using unsupervised machine learning methods (e.g., GloVe [17], word2vec [18]) on EMR text and identify similar terms based on the EMRs’ semantics. For example, Pakhomov et al. [19] found that word embeddings capture semantic relatedness between medical terms. Moreover, Zhu et al. [20] and Hanauer et al. [21] showed that semantically-based query recommendation systems can effectively expand search queries.
However, little research has analyzed the quality or impact of combining multiple semantic embeddings for chart review tasks. In this paper, we present the EMR-subsets method to identify similar terms from multiple embeddings, and evaluate the quality and quantity of similar terms across various clinical chart review tasks.
To evaluate the identified similar terms across quantitative and qualitative dimensions, we conduct multiple experiments, including an information retrieval evaluation, a user preference study, and a timed chart review task. The results show that the identified similar terms achieved better IR performance than the baseline methods, were preferred by most users, and significantly reduced the time to answer a question. Moreover, the selection method is able to identify an optimal number of similar terms.
This work differs from previous work in two critical ways: (1) the EMR-subsets method extracts similar terms by combining multiple EMR-based word embeddings; and (2) it is evaluated across multiple dimensions, including information retrieval performance, user preference, and time to answer a question from a chart.
MATERIALS AND METHODS
Semantic Embeddings
A semantic embedding, such as word2vec, projects words into a vector space by training a neural network on text [22,23]. Word2vec embeddings can be trained with two different methods: the Continuous Bag-of-Words (CBOW) method and the skip-gram method (predicting a word from its surrounding context vs. predicting the surrounding context from a word, respectively). Researchers have already applied word2vec embeddings to support clinical chart reviews, such as with query expansion [13] and search [24]. In this paper, we use the CBOW method for training, which is the default training algorithm of a word2vec model in Gensim [25]. The positions of words in the learned embedded vector space are used to estimate their similarity. Specifically, we measure the similarity of two words word_i and word_j using the cosine similarity of their embedded vectors v_i and v_j. The range of similarity is from zero to one.
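For concreteness, the following minimal sketch (assuming Gensim 4.x; the toy corpus and parameter values are illustrative, not the study’s data) trains a CBOW word2vec model and queries cosine similarities:

```python
from gensim.models import Word2Vec

# Toy stand-in for tokenized clinical notes; in the paper, each note type
# (e.g., "Clinic Note") forms a separate training corpus.
sentences = [
    ["patient", "history", "of", "breast", "cancer"],
    ["melanoma", "treated", "with", "radiation"],
    ["family", "history", "grandfather", "cancer"],
]

# sg=0 selects CBOW, Gensim's default word2vec training algorithm.
# vector_size, window, and min_count are illustrative values.
model = Word2Vec(sentences, sg=0, vector_size=100, window=5, min_count=1)

# Cosine similarity of two words' embedded vectors.
print(model.wv.similarity("cancer", "melanoma"))

# Top-K most similar terms for a keyword.
print(model.wv.most_similar("cancer", topn=10))
```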
Table 1 lists the documents used to train our embeddings. The "Complete EMR" data set refers to all clinical notes from the Vanderbilt University Medical Center Synthetic Derivative [26], a de-identified mirror of the EMR, which contained approximately 100 million clinical notes at the time of this study.
Table 1.
Data sets used for training semantic embeddings. Vocabulary size is the number of distinct words in the data set appearing at least 50 times.
| Training Data Set | Note Count | Vocabulary Size |
|---|---|---|
| Complete EMR | 100m | 277k |
| Clinical Communication | 19.2m | 67.0k |
| HP | 8.0m | 24.1k |
| Outpatient Rx Order Summary | 5.0m | 16.2k |
| Prescription | 4.0m | 17.1k |
| Problem List | 3.1m | 6.4k |
| Provider Communications | 2.6m | 12.2k |
| Clinic Note | 2.4m | 33.2k |
| Respiratory Care | 2.2m | 3.4k |
| Clinic Summary | 2.2m | 14.7k |
| Clinic Summary 2 | 2.1m | 28.2k |
| Rehab | 1.7m | 31.5k |
| Nurse’s Note | 1.4m | 16.9k |
| Emergency Department Nurse's Triage | 1.0m | 19.3k |
| Letter | 1.0m | 26.2k |
The other data sets are the largest 14 subsets of the EMR, each containing at least 1 million notes. For each data set, we trained a word2vec model with the default parameters using the implementation provided by Gensim [25], a Python library for semantic analysis. We name each embedding with the name of its training data set, and we call any embedding trained with a subset of the EMR system an “EMR-subset embedding.”
In addition to the EMR-based embeddings, we downloaded the pre-trained word2vec model from Google News (which we refer to as the News embedding) [22], which contains 3 million word vectors in a 300-dimension vector space, as one of the baseline word embeddings. The News embedding has been used in prior work for query expansion [27] and identifying similar terms [18].
The preprocessing transformations applied before training include:
(1) Parsing XML and HTML data formats to plain text using Beautiful Soup [28].
(2) Excluding stop words, words shorter than two characters, and words with a frequency less than ten in the training data set.
(3) Tokenizing the words using the Gensim [25] word tokenizer and lowercasing all words.
It is important to note that all EMR-based embeddings in this work were trained on unigrams, while the Google News embedding includes bi-grams.
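A minimal sketch of this preprocessing pipeline (using Beautiful Soup and Gensim utilities; the corpus-level frequency filter is applied separately, e.g., via word2vec’s min_count, so it is omitted here):

```python
from bs4 import BeautifulSoup
from gensim.utils import tokenize
from gensim.parsing.preprocessing import STOPWORDS

def preprocess(raw_note):
    """Approximate the paper's steps: (1) strip XML/HTML markup,
    (3) tokenize and lowercase, (2) drop stop words and short tokens."""
    text = BeautifulSoup(raw_note, "html.parser").get_text()
    tokens = (t.lower() for t in tokenize(text))
    return [t for t in tokens if len(t) >= 2 and t not in STOPWORDS]

print(preprocess("<note>The patient denies chest pain.</note>"))
# -> ['patient', 'denies', 'chest', 'pain']
```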
Similar Terms Extraction
In this section, we describe our EMR-subsets method to extract and merge similar terms from multiple EMR-subset embeddings. The approach is motivated by the observation that embeddings created from the entire EMR can be distorted by frequently occurring text. Instead, terms should be similar to the keyword throughout subsets of the EMR. For example, as shown in Table-A in the appendix, the “Rehab” EMR-subset embedding identifies “ca” as a top-10 similar term for “cancer.” Similarly, another EMR-subset embedding identifies “grandfather” as a word similar to “cancer,” likely because physicians document a family history of cancer (we queried the complete EMR and found that 27% of the documents containing “cancer” also contain “grandfather,” and that the two terms were often within five words of each other). However, “cancer” is not similar to “ca” or “grandfather” in other subsets, indicating these similar terms may be biased by the text of a single subset and therefore may not be ideal for searching or highlighting clinical documents.
The EMR-subsets method identifies similar terms of a given keyword w that have consistent similarity values across EMR subsets. As shown in Figure 1, three metrics are calculated to determine a similarity score for the EMR-subsets method. The intra-subset similarity is a term’s similarity to the keyword w using a specific subset’s embedding. The inter-subsets similarity is a term’s average similarity to the keyword w in all other subsets’ embeddings. The harmonic similarity is the harmonic mean of the intra-subset and inter-subsets similarities, which is maximized when the two similarities are equal and is zero if a term exists in only a single subset.
Figure 1.
Similar terms of “cancer” from the “Clinic Note” EMR-subset embedding, broken down by intra-subset similarity, inter-subsets similarity, and harmonic similarity. The harmonic similarity is used for ranking terms.
Extracting similar terms from the EMR-subset embeddings requires multiple steps.
(1) Candidate Term Generation and Intra-Subset Similarity: For a given keyword w and an EMR-subset embedding (e.g., the “Clinic Note” embedding), we generate the top-K similar terms of the keyword w. The similarities of these terms define the intra-subset similarities. The first column in Figure 1 lists the similar terms for “cancer” from the “Clinic Note” EMR-subset embedding, including family history terms (e.g., grandfather), misspellings (e.g., caner), and organs (e.g., colon).
(2) Inter-Subsets Similarity: For each candidate term t in each subset, we compute its average similarity to the keyword w (i.e., the inter-subsets similarity) based on the other EMR-subset embeddings (i.e., excluding the “Clinic Note” EMR-subset embedding). A candidate term that does not exist in some embeddings has a similarity of zero there, which lowers its inter-subsets similarity. If a candidate term exists only in the current EMR-subset embedding, we set its inter-subsets similarity to a minimum value (e.g., 0.001). The second column in Figure 1 lists those terms’ similarities to “cancer” across the other subsets; we observe that “grandfathers” has a lower similarity in other subsets, while “melanoma” is more similar.
(3) Harmonic Similarity: For each candidate term t in each subset, we compute the harmonic mean of its intra-subset similarity and inter-subsets similarity. As shown in Figure 1, the inter-subsets similarity between “cancer” and “grandfathers” is 0.21, which is much lower than the intra-subset similarity. In other words, “grandfathers” is similar to “cancer” only in the “Clinic Note” embedding, so it is unlikely to be included in the similar term list.
(4) Term Cutoff: For each subset, we apply the similarity-based cutoff method (described in detail below) to remove candidate terms with low harmonic similarities. As shown in red in Figure 1, we remove some of the family terms, such as “grandfathers” and “great-grandfather,” using a similarity cutoff of 0.33.
(5) Merge Similar Terms: Repeat steps (1)–(4) for each subset embedding and merge the similar term lists extracted from the EMR-subset word embeddings into one list. Table-A in the appendix shows the top 10 similar terms of “cancer” generated by the EMR-subsets similarity algorithm.
Formally, we present the process of extracting similar terms from a list of EMR-subset embeddings M = {M1, M2, …, Mm} for a keyword w (i.e., there are m embeddings in the list, one for each note type). Given an EMR-subset embedding Mj, we define the intra-subset similarity of two words as Sj(w1, w2), and the inter-subsets similarity of two words as Ij(w1, w2).
For each EMR-subset embedding Mj, we generate the top-K similar terms of the keyword w as the candidate terms. We then compute the inter-subsets similarity of each candidate term t as the average intra-subset similarity across the other subsets:

Ij(w, t) = (1 / (m − 1)) · Σ_{i ≠ j} Si(w, t)
We then compute the harmonic similarity of each candidate term t:

Hj(w, t) = 2 · Sj(w, t) · Ij(w, t) / (Sj(w, t) + Ij(w, t))
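A minimal sketch of steps (1)–(3), assuming `models` is a list of trained Gensim word2vec models, one per note type; the function name and the top_k default are illustrative:

```python
def emr_subsets_terms(keyword, models, j, top_k=100, min_sim=0.001):
    """Score the candidate similar terms of `keyword` for subset embedding
    models[j]. A sketch only; step (4), the similarity cutoff, is applied
    to the returned list afterwards."""
    # (1) candidate terms and their intra-subset similarities from subset j
    candidates = models[j].wv.most_similar(keyword, topn=top_k)
    scored = []
    for term, intra in candidates:
        # (2) inter-subsets similarity: average similarity to the keyword
        # over all other subsets, counting a missing term as similarity 0
        others = [m.wv.similarity(keyword, term) if term in m.wv else 0.0
                  for i, m in enumerate(models) if i != j]
        inter = sum(others) / len(others)
        if inter <= 0:            # term exists only in subset j
            inter = min_sim
        # (3) harmonic mean of intra- and inter-subsets similarities
        harmonic = 2 * intra * inter / (intra + inter)
        scored.append((term, harmonic))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```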
Next, we remove low-similarity terms from each EMR-subset embedding Mj, since the number of similar terms impacts the quality of search and highlighting. For example, Figure 2 shows that as the list of search terms is expanded from [epilepsy] to include additional terms, the relevance of the retrieved documents increases initially but then decreases as the list grows (here, relevance is the percentage of highly similar terms in the documents of the expanded search result).
Figure 2.
Example of expanded document quality analysis for “epilepsy.” The proportion of high similarity terms (i.e. terms that have similarities larger than 0.60 while 1.0 is the maximum value) decreases with similar term expansion.
Cutoff Method: The method to determine the similarity cutoff is outlined as follows. We represent the similar terms of a keyword as a two-dimensional curve L (Figure 3), with the similar terms along the x-axis (represented by their indexes), sorted by harmonic similarity in descending order, and their similarity values along the y-axis. We define the cutoff point as the “elbow” of the curve L, because the benefit of adding more terms after this point is lower than the average benefit of choosing all terms. Formally, a cutoff point has a smoothed derivative equal to the slope of the line ℓ joining the endpoints of L. Because irregularities in the curve L produce multiple points with a derivative matching the slope of ℓ, we use an approximate method to identify a unique cutoff point on the curve L:
(1) Draw a line ℓ between the endpoints of L.
(2) Calculate the minimum distance from each point on the curve L to the line ℓ.
(3) Choose the point with the maximum distance to the line ℓ as the cutoff point. For a smooth curve, the derivative of L at this point equals the slope of ℓ by the mean value theorem.
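A minimal NumPy-based sketch of this approximate cutoff computation (the similarity values below are illustrative):

```python
import numpy as np

def elbow_cutoff(similarities):
    """Return the index of the 'elbow' of a descending similarity curve:
    the point with maximum perpendicular distance to the chord joining
    the curve's endpoints."""
    y = np.asarray(similarities, dtype=float)
    x = np.arange(len(y), dtype=float)
    p1, p2 = np.array([x[0], y[0]]), np.array([x[-1], y[-1]])
    chord = (p2 - p1) / np.linalg.norm(p2 - p1)   # (1) line between endpoints
    vecs = np.stack([x, y], axis=1) - p1
    proj = np.outer(vecs @ chord, chord)          # projection onto the chord
    dists = np.linalg.norm(vecs - proj, axis=1)   # (2) distance to the line
    return int(np.argmax(dists))                  # (3) farthest point wins

sims = [0.62, 0.55, 0.50, 0.46, 0.36, 0.33, 0.32, 0.31, 0.30]
k = elbow_cutoff(sims)
print(k, sims[k])  # terms ranked below the cutoff are discarded
```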
Figure 3.
Example of similarity cutoff computation. Since all terms have similarities larger than 0.40, the y-axis starts from 0.3. Similarity cutoff is at the “elbow” of the similarity curve (arrow).
Finally, we merge the similar terms extracted from the EMR-subsets embeddings as the final similar term list for the keyword w.
The results of the elbow method depend on the number of candidate terms K: different K values change the similarity curve and can shift the cutoff value. In fact, the elbow method can also be used to choose the K value itself. For the search terms “diabetes” and “seizure,” we tested values of K from 1 to 1000, and the elbow method identified K = 100; larger values of K did not improve results. The optimal K value may vary for different search terms.
Baseline Methods
To compare against the EMR-subsets method’s similar terms, we use three baseline methods: (i) terms from the Complete EMR embedding, (ii) terms from the News embedding, and (iii) terms from the combined Complete EMR and News embeddings (EMR-News), which is defined as follows. For a keyword w, the EMR-News similar terms are terms that are similar to the keyword w in both the EMR and News embeddings. We extracted EMR-News similar terms in three steps:
(1) Extract the top-K similar terms of the keyword w from the EMR and News embeddings, and use the intersection of those similar terms as the candidate terms.
(2) For each candidate term t, compute its EMR-News harmonic similarity to the keyword w, which is the weighted harmonic mean of the similarity Sc provided by the EMR embedding and the similarity Sg provided by the News embedding:

Hcg(w, t) = (λc + λg) / (λc / Sc(w, t) + λg / Sg(w, t)),

where λc and λg are the weights assigned to the two embeddings.
(3) Sort all candidate terms by their EMR-News harmonic similarities and use the cutoff method to select a similar term list. A candidate term t has a high EMR-News harmonic similarity if and only if it has high similarity to the keyword w in both the EMR and News embeddings.
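Because the paper does not report the weights it used, the sketch below assumes equal weights; it shows only the weighted harmonic mean of step (2):

```python
def emr_news_similarity(s_emr, s_news, w_emr=1.0, w_news=1.0):
    """Weighted harmonic mean of a candidate term's similarities in the
    EMR (s_emr) and News (s_news) embeddings. Equal weights are an
    assumption; the paper does not specify the weights."""
    return (w_emr + w_news) / (w_emr / s_emr + w_news / s_news)

# High only when the term is similar to the keyword in BOTH embeddings.
print(emr_news_similarity(0.7, 0.6))   # ~0.65
print(emr_news_similarity(0.7, 0.05))  # ~0.09
```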
Evaluations
User Preference Study
We designed a user preference study to evaluate whether the extracted similar terms are preferred by users with different medical knowledge levels in various chart review scenarios. We compared the selections of similar terms provided by the EMR-subsets method and the three baseline methods. As shown in Table 2, we recruited 11 Vanderbilt University Medical Center medical doctors (MDs) at the level of residency training or above, and 20 Non-MD Amazon Mechanical Turk [29] workers in the United States. Only the MDs have verified clinical knowledge. We chose fourteen keywords (each of which was categorized as a general or clinical term), and asked users to choose the best list of similar terms for each keyword.
Table 2.
Framework of the user preference study.
| (a) User types | | |
|---|---|---|
| Name | Knowledge Level | Size |
| MD | Medical Doctor Level | 11 |
| Non-MD | No Verified Level | 20 |
| (b) Medical note review tasks | | |
| Type | Keywords | |
| General | Advil, Cancer, Fracture, Kidney, Ventilator, Walking | |
| Clinical | Cefuroxime, EEG, Epilepsy, Irrigate, Keppra, Pruritis, Rhinorrhea | |
| (c) Similar Terms | | |
| Source | Abbreviation | |
| EMR word embedding | EMR | |
| News word embedding | News | |
| EMR and News word embeddings | EMR-News | |
| EMR-subset word embeddings | EMR-subsets | |
Figure 4 shows the web page for the user preference study, which contains 14 questions asking participants to choose their preferred similar term list in a chart review task.
Figure 4.
Screenshot of the preference survey. An introduction is provided, followed by 14 questions that ask the participant to choose the best list to expand a keyword. List orders were randomized to hide source methods.
We applied multinomial logistic regression [30,31] to analyze users’ preferences for similar terms across the extraction methods. As shown below, each logistic model takes user type (0 = MD, 1 = Non-MD) and task type (0 = Clinical, 1 = General) as input, and outputs the log-odds of choosing one method over the reference method:

log(P(method) / P(reference)) = β0 + β1 · UserType + β2 · TaskType

The null hypotheses are: (1) the user type and task type have no effect on the selection of similar terms; and (2) there is no significant preference among the similar terms provided by the EMR-subsets method and the baseline methods.
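A sketch of this analysis using statsmodels’ MNLogit; the data frame below is a simulated stand-in for the survey responses (the marginal choice shares roughly follow Table 4, everything else is made up):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulated stand-in for the survey data: one row per selection.
rng = np.random.default_rng(0)
n = 441
df = pd.DataFrame({
    "user_type": rng.integers(0, 2, n),   # 0 = MD, 1 = Non-MD
    "task_type": rng.integers(0, 2, n),   # 0 = Clinical, 1 = General
    "choice": rng.choice(["EMR", "News", "EMR-News", "EMR-subsets"],
                         size=n, p=[0.10, 0.29, 0.09, 0.52]),
})

# Multinomial logistic regression of choice on user type and task type;
# fitted coefficients are log-odds of each method vs. the reference category.
X = sm.add_constant(df[["user_type", "task_type"]])
y = pd.Categorical(df["choice"]).codes
fit = sm.MNLogit(y, X).fit(disp=False)
print(fit.summary())
```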
Information Retrieval Experiments
To evaluate the information retrieval (IR) performance of the EMR-subsets method, we selected nine search terms from Table 2 (shown in Table 3), including eight single-word search terms and one multi-word search term. For each search term, we randomly selected 60 documents from a patient cohort defined by a specific ICD-9 code (Table 3), in which some documents contain the search term (referred to as the exact-match subset) and others do not (referred to as the non-exact match subset). We then asked three medical researchers (referred to as users 1, 2, and 3) to label each note’s relevance to the search term (1 = relevant, 2 = partially relevant, or 3 = irrelevant).
Table 3.
Information Retrieval Performance Evaluation Data Sets
| Search Term | ICD-9 code | Type | Number of Exact-Match Documents | Number of Non-Exact Match Documents |
|---|---|---|---|---|
| Breast Cancer | 174.9 | General | 40 | 20 |
| Epilepsy | 345.9 | Clinical | 37 | 23 |
| Fracture | 829.0 | General | 38 | 22 |
| Headache | 784.0 | General | 30 | 30 |
| Kidney | 593.9 | General | 34 | 26 |
| Pruritus | 698.9 | Clinical | 26 | 34 |
| Respiration | 786.52 | Clinical | 36 | 24 |
| Rhinorrhea | 478.19 | Clinical | 31 | 29 |
| Walking | 719.7 | General | 36 | 24 |
Next, for each search term and extraction method, we evaluated the P@5 and P@10 scores, where precision-at-K (P@K) is defined as the proportion of relevant or partially relevant notes among the top-K ranked notes. Notes are ranked proportionally to the number and weight of similar terms in a note; formally, for a keyword w with similar term list T(w), a note d is scored as

score(d, w) = Σ_{t ∈ T(w)} freq(t, d) · sim(w, t),

where freq(t, d) is the frequency of term t in note d and sim(w, t) is the term’s similarity to the keyword w.
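A minimal sketch of the ranking and P@K computation under these definitions (the note scoring follows the sum-of-term-similarities ranking described in the Discussion; all data below are illustrative):

```python
def note_score(note_tokens, similar_terms):
    """Rank score for a note: sum of term similarities over every
    occurrence of a similar term in the note."""
    return sum(similar_terms.get(tok, 0.0) for tok in note_tokens)

def precision_at_k(ranked_labels, k):
    """P@K: fraction of the top-K ranked notes labeled relevant or
    partially relevant (encoded as True)."""
    return sum(ranked_labels[:k]) / k

# similar_terms maps each expansion term to its (harmonic) similarity;
# the exact search term itself carries weight 1.0.
similar_terms = {"epilepsy": 1.0, "seizures": 0.72, "eeg": 0.55}
notes = [["patient", "seizures", "eeg"], ["knee", "pain"], ["epilepsy", "eeg"]]
labels = [True, False, True]  # human relevance judgments, same order as notes

order = sorted(range(len(notes)), reverse=True,
               key=lambda i: note_score(notes[i], similar_terms))
print(precision_at_k([labels[i] for i in order], k=2))  # -> 1.0
```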
Elbow Method Evaluation
To evaluate the elbow method, we randomly identified 300 notes from patients in the EMR system with an ICD-9 code for “seizure” (780.39), and another 300 notes from patients with an ICD-9 code for “diabetes” (250.*). As a result, some notes are relevant to diabetes or seizure care and some are not. We then asked four medical researchers to label each note’s relevance to the disease (1 = relevant, 2 = partially relevant, or 3 = irrelevant), which produced four labeled document sets for the “diabetes” cohort and four for the “seizure” cohort.
Next, for each document set, we used “diabetes” and “seizure” as the initial queries for the respective document sets, expanded each search with the similar terms from the EMR-subsets method, and evaluated the impact of the cutoff method by comparing its IR performance to manually selected cutoff values.
Time Efficiency Experiment
Two medical researchers, who were not investigators of this study, analyzed a cohort of 100 patients (with an average of 75 notes per patient) to determine whether each patient had dialysis within 2 weeks of surgery. For each patient, the researchers answered YES or NO. For half of the patients, we provided exact keyword search and highlighting to support chart review, in which notes were ranked proportionally to the keyword’s frequency in a note. For the other half, similar terms were used to expand the search and highlighting. We recorded and compared the time needed to identify the answer under the two methods. Moreover, we compared the two researchers’ answers to measure label accuracy.
RESULTS
User Preference Study
We received responses from all 11 MDs and all 20 Non-MDs (a 100% response rate), for a total of 441 preferences (31 users × 14 questions = 434 selections, plus 7 additional selections from users who chose more than one list). As shown in Table 4, the EMR-subsets method received 52% of the selections, more than any other similar term extraction method. Moreover, the selection of the EMR-subsets method varies with user type and task type.
Table 4.
Panel (a) records the overall preferences for similar terms extracted from different sources. Panel (b) records the count and percentage of similar term selections by user type. Panel (c) records the count and percentage of selections by task type.
| (a) Preference of Similar Terms | |||||
|---|---|---|---|---|---|
| Source | EMR | News | EMR-News | EMR-subsets | Total |
| Total Selections | 44 (9.9%) | 129 (29.0%) | 39 (8.8%) | 229 (52.0%) | 441 |
| (b) Similar Terms Selections by User type | |||||
| Source | EMR | News | EMR-News | EMR-subsets | Total |
| MD Selections | 15 (9.7%) | 31 (20.0%) | 15 (9.7%) | 93 (60.0%) | 154 (100%) |
| Non-MD Selections | 29 (10.0%) | 98 (34.0%) | 24 (8.0%) | 136 (47.0%) | 287 (100%) |
| (c) Similar Term Selections by Task type | |||||
| Source | EMR | News | EMR-News | EMR-subsets | Total |
| Clinical Selections | 26 (12.0%) | 56 (25.0%) | 21 (9.0%) | 120 (54.0%) | 223 (100%) |
| General Selections | 18 (8.0%) | 73 (33.0%) | 18 (8.0%) | 109 (50.0%) | 218 (100%) |
We applied multinomial logistic regression models to analyze the results of the user preference study. As shown in Table 5, both the user type and task type have a significant effect on user preference. Based on the intercepts and coefficients of models 3, 4, and 5 in Table 5, we concluded that both MD and Non-MD users prefer the similar terms provided by the EMR-subsets method over the baseline methods, in both the clinical and general tasks.
Table 5.
Analysis of the Impact of user type and task type on the preference of similar terms. User type (MD=0, Non-MD=1) and task type (Clinical=0, General=1) are the inputs of the multinomial logistic regression models. The significance levels are: {**: p-Value < 0.001, *: p-Value < 0.05, one-tailed}.
| Index | Logistic Regression Model | Intercept | User type | Task type |
|---|---|---|---|---|
| 1 | EMR vs. News | -0.40 | -0.51 | -0.64 |
| 2 | EMR-News vs. News | -0.50 | -0.69 | -0.43 |
| 3 | EMR-subsets vs. News | 1.30** | -0.78* | -0.38 |
| 4 | EMR-subsets vs. EMR | 1.70** | -0.27 | 0.26 |
| 5 | EMR-subsets vs. EMR-News | 1.80** | -0.09 | 0.06 |
| 6 | EMR vs. EMR-News | 0.09 | 0.18 | -0.21 |
Information Retrieval Performance
Table 6 shows the average percentage of positive labels (i.e., relevant or partially relevant labels) in the exact match and non-exact match subsets of each evaluation data set. As Table 6 shows, the non-exact match subsets contain amounts of positive documents comparable to the exact match subsets. Therefore, it is important to develop efficient methods to identify useful documents in the non-exact match subsets.
Table 6.
The Distribution of Positive Labels in the Evaluation Data Sets.
| Search Term | Average percentage of positive labeled Exact Match Documents | Average percentage of positive labeled Non-Exact Match Documents |
|---|---|---|
| Breast Cancer | 68.5% | 73.6% |
| Epilepsy | 47.7% | 59.4% |
| Fracture | 48.2% | 54.5% |
| Headache | 83.3% | 65.6% |
| Kidney | 69.6% | 71.8% |
| Pruritus | 47.4% | 68.6% |
| Respiration | 43.5% | 51.4% |
| Rhinorrhea | 52.7% | 33.3% |
| Walking | 68.5% | 73.6% |
The P@5 performance for all search terms is shown in Table 7, and the P@10 performance in Table 8. As Tables 7 and 8 show, adding similar words provided by the EMR-subsets method improves the average P@5 and P@10 results in all evaluation data sets compared to keyword-only search. Moreover, the EMR-subsets method outperforms the other extraction methods. In particular, it significantly outperforms the other methods in the non-exact match subsets, which indicates that the EMR-subsets method provides better similar words.
Table 7.
The average P@5 scores of each similar word extraction method on different data sets. A one-sided Mann-Whitney U test was applied to compare the P@5 scores of the EMR-subsets method with those of the other methods. Methods that the EMR-subsets method significantly outperformed are marked with ** (p-value < 0.001).
| Data Sets | EMR-subsets | EMR | News | EMR-News | Keywords |
|---|---|---|---|---|---|
| Exact & Non-Exact Match | 0.60 | 0.48 | 0.59 | 0.55 | 0.48 |
| Exact Match | 0.57 | 0.48 | 0.59 | 0.56 | 0.48 |
| Non-Exact Match | 0.59 | 0.39** | 0.37** | 0.41** | 0.00 |
Table 8.
The average P@10 scores of each similar word extraction method on different data sets. A one-sided Mann-Whitney U test was applied to compare the P@10 scores of the EMR-subsets method with those of the other methods. Methods that the EMR-subsets method significantly outperformed are marked with ** (p-value < 0.001).
| Data Sets | EMR-subsets | EMR | News | EMR-News | Keywords |
|---|---|---|---|---|---|
| Exact & Non-Exact Match | 0.56 | 0.46 | 0.50 | 0.55 | 0.50 |
| Exact Match | 0.53 | 0.46 | 0.47 | 0.57 | 0.50 |
| Non-Exact Match | 0.59 | 0.32** | 0.19** | 0.39** | 0.00 |
Elbow Method Evaluation
As shown in Table 9, the similarity cutoff method is able to identify an optimal similarity cutoff, which provides a better P@20 score than the manually selected similarity cutoffs when using the EMR-subsets method.
Table 9.
The average P@20 scores when searching “diabetes” and “seizure” with similar words defined by different similarity cutoffs.
| Similarity Cutoff | Average P@20 when searching “diabetes” | Average P@20 when searching “seizure” |
|---|---|---|
| 1.0 | 0.64 | 0.80 |
| 0.8 | 0.64 | 0.89 |
| 0.4 | 0.61 | 0.90 |
| 0.2 | 0.54 | 0.61 |
| Elbow method | 0.68 | 0.94 |
Time Efficiency Analysis
For the note review task, we measured the time to complete each task and the quality of labels produced by the two researchers. Ideally, the researchers would maintain their label accuracy while completing tasks faster.
The results showed that the labels provided by the researchers were highly consistent: the researchers agreed on all documents except one. Table 10 shows the median time and interquartile range (IQR) of the time each researcher spent reviewing notes with and without highlighting of the query’s similar words. We used a one-sided Mann-Whitney U test to analyze the difference in review times with and without highlighted similar words. All Mann-Whitney U tests produced p-values less than 0.05, showing that searching and highlighting similar words reduced task time.
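For reference, this test corresponds to SciPy’s mannwhitneyu with a one-sided alternative; the timing vectors below are hypothetical stand-ins, not the study’s measurements:

```python
from scipy.stats import mannwhitneyu

# Hypothetical per-patient review times (seconds) for one researcher.
times_similar = [9, 8, 11, 10, 9, 12, 8, 9]      # similar-term highlighting
times_exact   = [12, 9, 26, 11, 15, 13, 10, 22]  # exact-keyword highlighting

# One-sided test: are review times with similar-term highlighting
# stochastically smaller than those with exact-keyword highlighting?
stat, p = mannwhitneyu(times_similar, times_exact, alternative="less")
print(stat, p)
```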
DISCUSSION
This paper reports the development and evaluation of a novel similar term extraction method, the EMR-subsets method, which utilizes the subsets of an EMR system to extract similar terms that support efficient search and consumption of clinical documents. Compared to the baseline methods, the EMR-subsets method (i) utilized less training data, (ii) received more selections in the user preference study, (iii) achieved higher IR performance, and (iv) reduced the time needed to answer questions in a timed chart review task.
Previous research demonstrated that ensemble semantic embeddings provide better similar terms (for example, by summing the similarities from multiple semantic spaces [5] or combining vectors from multiple semantic embeddings [18]). However, these methods combined embeddings trained on different data sources, or attempted to learn a global embedding instead of merging the most similar terms from each subset. In this paper, the EMR-subsets method utilized the subsets of a single data set and was preferred by users, while the combination of the EMR and News embeddings was less preferred in the user study.
Interestingly, as shown in Table-A of the appendix, highly similar terms for “cancer” in the Complete EMR embedding are related to family history, while the similar terms from the News embedding describe types of cancer. In contrast, the EMR-subsets method listed more clinical terms as similar to “cancer.” One possible reason for this difference is that physicians commonly document a patient’s family history of cancer in specific note types; the EMR-subsets method reduces the impact of words that co-occur within a popular note type. Therefore, the community should be careful about incorporating increasingly large data sets when training semantic embeddings for clinical applications.
There are several limitations of this study that suggest future work. First, we limited the EMR-subsets method to the largest clinical note types in an EMR system. Future work can consider all note types, or subsets constructed by alternative methods, such as by common phenotypes [32]. Second, while the study attempted to discern the scenarios in which the News embedding would perform best (i.e., general note review tasks), additional analysis is needed to understand why some users preferred the similar terms provided by the News embedding in some tasks. In addition, a fine-grained information retrieval analysis is needed to determine whether positive search preferences correlate with information retrieval performance across many search scenarios (i.e., whether the preferred similar terms provide better information retrieval performance). Third, the construction of user types and task types could be formalized and made more fine-grained, for example, by categorizing MD users by discipline or skill. Fourth, we utilized semantic embeddings to identify similar words, while other methods, such as graphical models [33], could be used to find related terms. Fifth, we only included unigrams when training EMR-based embeddings in the current study. We did experiment with embeddings based on bi-grams and tri-grams; however, they required much more training data and computational resources due to the larger vocabulary space, and some bi-grams have no clinical meaning, such as “table_also.” One possible extension is to add bi-grams or tri-grams drawn from a clinical dictionary, such as SNOMED CT or RxNorm. Sixth, we ranked notes by the sum of term similarities. Possible future work includes normalizing the similarities before ranking and introducing other ranking methods. Moreover, as shown in Table 6, many notes contain the search term but were not marked as relevant, which confounds recall evaluations; therefore, in the information retrieval experiments, we only presented P@K scores.
CONCLUSION
This paper presents the EMR-subsets method, which extracts similar terms from multiple semantic embeddings trained from subsets of the EMR. We systematically evaluated the similar terms extracted by the approach using qualitative and quantitative methods. Compared to the other baseline methods, the similar terms provided by the EMR-subsets method were preferred in a user preference study, achieved higher P@5 and P@10 scores across multiple search terms, and reduced the time spent searching and consuming clinical information for two researchers in a small pilot study.
Table 10.
The median time (25th and 75th percentile) that medical researchers spent reviewing one patient’s notes. A one-sided Mann-Whitney U test was applied for the analysis. The significance levels are: {**: p-value < 0.001, *: p-value < 0.05, one-tailed}.
| Researcher | Median time in seconds when reviewing one patient’s notes (25th and 75th percentile time) with highlighted similar words | Median time in seconds when reviewing one patient’s notes (25th and 75th percentile time) with highlighted exact words |
|---|---|---|
| Medical researcher 1 | 9.0 (8.0 – 11.0)** | 11.5 (9.0–26.3)** |
| Medical researcher 2 | 76.5 (57.0 – 112.0)* | 91.5 (73.5 – 135.0)* |
Highlights.
Multiple word embeddings have been trained on different types of clinical notes.
A method is presented to extract and merge terms from the word embeddings.
The method was evaluated in terms of user preference and retrieval performance.
The method allows researchers to answer chart review questions more quickly.
Acknowledgments:
The training data for the word2vec semantic embeddings was obtained from VUMC’s Synthetic Derivative, which is supported by institutional funding and by Vanderbilt CTSA grant UL1 TR000445 from NCATS/NIH.
Funding: Crowd Sourcing Labels from Electronic Medical Records to Enable Biomedical Research Award Number: 1 UH2 CA203708–01
Appendix
Table-A: The similar terms for “cancer” provided by the EMR-subsets, EMR-News, EMR, and News similar term extraction methods and the “Clinical Summary 2,” “Problem List,” and “Rehab” EMR-subset embeddings.
| EMR-subsets | EMR-News | EMR | News | Clinical Summary 2 | Problem List | Rehab |
|---|---|---|---|---|---|---|
| melanoma | leukemia | cancem | lung cancer | anut | Prostate | ca |
| breast | hashimoto | cnacer | colon cancer | paternal | Lung | carcinoma |
| prostate | malignancies | endocrinopathies | leukemia | deceased | colon | melanoma |
| carcinoma | nonpolyposis | at age | cancers | maternal | breast | xrt |
| metastatic | diabetes | cousins | liver cancer | grandmother | father | ademiocarcinoma |
| colon | cancer | gf | brain tumor | Uncle | maternal | lumpectomy |
| malignant | alzheimer | social history | brain tumors | Sister | skin | prostate |
| tumor | hpth | grandfather | bladder cancer | Father | oid | chemoxrt |
| radiation | sitosterolemia | meopausal | prostrate cancer | Mother | diabetes | chemoradiation |
| ca | masectomy | cance | colorectal cancer | Grandfather | of | malignant |
Table-B: The similar terms of “epilepsy” provided by the EMR-subsets, EMR-News, News, and EMR similar term extraction methods and the “Clinical Summary” and “Clinical Summary 2” EMR-subset embeddings.
| EMR-subsets | EMR-News | News | EMR | Clinical Summary | Clinical Summary 2 |
|---|---|---|---|---|---|
| seizures | jme | schizophrenia | eses | clonipin | seizure |
| seizure | onti | bipolar disorder | localization | emu | seizures |
| eeg | clobazam | intractable epilepsy | epileptic | vimpat | epileptic |
| intractable | Pgb | Epilepsy | semiology | amir | intractable |
| keppra | vimpat | ADHD | jme | arain | gastaut |
| epileptic | epilepsies | Lennox Gastaut syndrome | clobazam | hasan | staring |
| tonic | epileptic | epileptic seizures | veeg | somnezturk | cerebral |
| myoclonic | epileptiform | dystonia | generalization | bassel | lemiox |
| bipolar | dystonia | multiple sclerosis | wada | klialil | myoclonic |
| neurology | astatic | Dravet syndrome | cbz | neuro | siezures |
Table-C: The similar terms of “ventilator” provided by the EMR-subsets, EMR-News, News, and EMR similar term extraction methods and the “Clinical Summary,” “Clinical Summary 2,” and “Rehab” EMR-subset embeddings.
| EMR-subsets | EMR-News | News | EMR | Clinical Summary | Clinical Summary 2 | Rehab |
|---|---|---|---|---|---|---|
| vent | cmv | respirator | servo | vent | vent | canula |
| trach | ventilators | mechanical ventilator | asset | hme | ventilation | vent |
| cannula | bedside | intensive care | hed | trach | intubation | hfov |
| intubation | flolan | ventilators | simy | humidified | gj | vapotherm |
| tracheostomy | ventilation | breathing tube | idb | saturations | requirement | ncpap |
| oxygen | intubation | artificial respirator | model | tracheostomy | peep | extubation |
| ventilation | extubated | ECMO machine | drager | percussion | vapotherm | hljv |
| bipap | nebulizer | ventilator support | serial | cannula | pccu | intubation |
| sats | cannula | Intensive Care Unit | rrt | cuffed | weaned | simv |
| intubated | weaned | tracheotomy tube | monitor | acapella | cooling | ventilation |
Table-D: The similar terms of “EEG” provided by the EMR-subsets, EMR-News, News, and EMR similar term extraction methods and the “Clinical Summary 2” and “Prescription” EMR-subset embeddings.
| EMR-subsets | EMR-News | News | EMR | Clinical Summary 2 | Prescription |
|---|---|---|---|---|---|
| seizure | pdr | electroencephalogram EEG | discharges | emu | vlkuham |
| seizures | epileptiform | electroencephalograph | pdr | wada | seizure |
| emu | fosphenytoin | electroencephalogram | interictally | ictal | aed |
| epilepsy | hypsarrhythmia | electroencephalograph EEG | generalized | deprived | keppra |
| mri | alternant | EEGs | voltage | epileptogenicity | emu |
| brain | jme | evoked potentials | frontally | epileptiform | epilepsy |
| spells | opisthotonus | electroencephalograph EEG | fronto | discharges | pinaqvi |
| aed | amobarbital | electroencephalography | electrographic | nmri | seizures |
| staring | clonic | electroencephalograms | epileptogenici | seizures | gnb |
| neurology | frontopolar | brainwaves | theta | seizure | vimpat |
Footnotes
Competing interests: None.
Provenance and peer review: Not commissioned; externally peer reviewed.
REFERENCES
- [1] Rasmussen LV, The electronic health record for translational research, J. Cardiovasc. Transl. Res. 7 (2014) 607–614. doi:10.1007/s12265-014-9579-z.
- [2] Chen L, Guo U, Illipparambil LC, Netherton MD, Sheshadri B, Karu E, Peterson SJ, Mehta PH, Racing against the clock: internal medicine residents’ time spent on electronic health records, J. Grad. Med. Educ. 8 (2016) 39–44. doi:10.4300/JGME-D-15-00240.1.
- [3] Hripcsak G, Vawdrey DK, Fred MR, Bostwick SB, Use of electronic clinical documentation: time spent and team interactions, J. Am. Med. Inform. Assoc. 18 (2011) 112–117. doi:10.1136/jamia.2010.008441.
- [4] Lai KH, Topaz M, Goss FR, Zhou L, Automated misspelling detection and correction in clinical free-text records, J. Biomed. Inform. 55 (2015) 188–195. doi:10.1016/j.jbi.2015.04.008.
- [5] Henriksson A, Moen H, Skeppstedt M, Daudaravicius V, Duneld M, Synonym extraction and abbreviation expansion with ensembles of semantic spaces, J. Biomed. Semantics 5 (2014) 6. doi:10.1186/2041-1480-5-6.
- [6] Biron P, Metzger MH, Pezet C, Sebban C, Barthuet E, Durand T, An information retrieval system for computerized patient records in the context of a daily hospital practice: the example of the Leon Berard Cancer Center (France), Appl. Clin. Inform. 5 (2014) 191–205. doi:10.4338/ACI-2013-08-CR-0065.
- [7] Natarajan K, Stein D, Jain S, Elhadad N, An analysis of clinical queries in an electronic health record search utility, Int. J. Med. Inform. 79 (2010) 515–522. doi:10.1016/j.ijmedinf.2010.03.004.
- [8] Tawfik AA, Kochendorfer KM, Saparova D, Al Ghenaimi S, Moore JL, “I don’t have time to dig back through this”: the role of semantic search in supporting physician information seeking in an electronic health record, Perform. Improv. Q. 26 (2014) 75–91. doi:10.1002/piq.21158.
- [9] Zalis M, Harris M, Advanced search of the electronic medical record: augmenting safety and efficiency in radiology, J. Am. Coll. Radiol. 7 (2010) 625–633. doi:10.1016/j.jacr.2010.03.011.
- [10] Gregg W, Jirjis J, Lorenzi NM, Giuse D, StarTracker: an integrated, web-based clinical search engine, AMIA Annu. Symp. Proc. (2003) 855. http://www.ncbi.nlm.nih.gov/pubmed/14728360 (accessed October 24, 2016).
- [11] Hanauer DA, Mei Q, Law J, Khanna R, Zheng K, Supporting information retrieval from electronic health records: a report of University of Michigan’s nine-year experience in developing and using the Electronic Medical Record Search Engine (EMERSE), J. Biomed. Inform. 55 (2015) 290–300. doi:10.1016/j.jbi.2015.05.003.
- [12] Ooi J, Ma X, Qin H, Liew SC, A survey of query expansion, query suggestion and query refinement techniques, 2015 4th Int. Conf. Softw. Eng. Comput. Syst. (ICSECS 2015) (2015) 112–117. doi:10.1109/ICSECS.2015.7333094.
- [13] Goodwin T, Harabagiu SM, UTD at TREC 2014: query expansion for clinical decision support, 23rd Text Retrieval Conf. (TREC 2014) Proc. (2014).
- [14] Pal D, Mitra M, Bhattacharya S, Exploring query categorisation for query expansion: a study, arXiv preprint arXiv:1509.05567 (2015) 1–34. http://arxiv.org/abs/1509.05567.
- [15] NIH-NLM, SNOMED Clinical Terms® (SNOMED CT®), U.S. National Library of Medicine (2015). http://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html.
- [16] Martinez D, Otegi A, Soroa A, Agirre E, Improving search over Electronic Health Records using UMLS-based query expansion through random walks, J. Biomed. Inform. 51 (2014) 100–106. doi:10.1016/j.jbi.2014.04.013.
- [17] Pennington J, Socher R, Manning CD, GloVe: global vectors for word representation, Proc. 2014 Conf. Empir. Methods Nat. Lang. Process. (2014) 1532–1543. doi:10.3115/v1/D14-1162.
- [18] Speer R, Chin J, An ensemble method to produce high-quality word embeddings, arXiv preprint arXiv:1604.01692 (2016). http://arxiv.org/abs/1604.01692.
- [19] Pakhomov SVS, Finley G, McEwan R, Wang Y, Melton GB, Corpus domain effects on distributional semantic modeling of medical terms, Bioinformatics 32 (2016) 3635–3644. doi:10.1093/bioinformatics/btw529.
- [20] Zhu D, Wu S, Carterette B, Liu H, Using large clinical corpora for query expansion in text-based cohort identification, J. Biomed. Inform. 49 (2014) 275–281. doi:10.1016/j.jbi.2014.03.010.
- [21] Hanauer DA, Wu DTY, Yang L, Mei Q, Murkowski-Steffy KB, Vydiswaran VGV, Zheng K, Development and empirical user-centered evaluation of semantically-based query recommendation for an electronic health record search engine, J. Biomed. Inform. 67 (2017) 1–10. doi:10.1016/j.jbi.2017.01.013.
- [22] Mikolov T, Corrado G, Chen K, Dean J, Efficient estimation of word representations in vector space, Proc. Int. Conf. Learn. Represent. (ICLR 2013) (2013) 1–12.
- [23] Mikolov T, Chen K, Corrado G, Dean J, Distributed representations of words and phrases and their compositionality, Adv. Neural Inf. Process. Syst. (NIPS 2013) (2013) 1–9.
- [24] Jin M, Li H, Schmid CH, Wallace BC, Using electronic medical records and physician data to improve information retrieval for evidence-based care, IEEE Int. Conf. Healthc. Informatics (2016). doi:10.1109/ICHI.2016.12.
- [25] Rehurek R, Sojka P, Software framework for topic modelling with large corpora, in: Proc. LREC 2010 Workshop on New Challenges for NLP Frameworks, ELRA, 2010, pp. 45–50.
- [26] Roden DM, Pulley JM, Basford MA, Bernard GR, Clayton EW, Balser JR, Masys DR, Development of a large-scale de-identified DNA biobank to enable personalized medicine, Clin. Pharmacol. Ther. 84 (2008) 363.
- [27] Diaz F, Mitra B, Craswell N, Query expansion with locally-trained word embeddings, arXiv preprint arXiv:1605.07891 (2016) 367–377. http://arxiv.org/abs/1605.07891 (accessed October 24, 2016).
- [28] Richardson L, Beautiful Soup Documentation, (2016) 1–72. http://www.crummy.com/software/BeautifulSoup/bs4/doc/.
- [29] Buhrmester M, Kwang T, Gosling SD, Amazon’s Mechanical Turk: a new source of inexpensive, yet high-quality, data?, Perspect. Psychol. Sci. 6 (2011) 3–5. doi:10.1177/1745691610393980.
- [30] Starkweather J, Moske AK, Multinomial logistic regression, (2011) 404–410.
- [31] UCLA Statistical Consulting Group, Multinomial logistic regression | R data analysis examples, (2014). https://stats.idre.ucla.edu/r/dae/multinomial-logistic-regression/.
- [32] Shivade C, Raghavan P, Fosler-Lussier E, Embi PJ, Elhadad N, Johnson SB, Lai AM, A review of approaches to identifying patient phenotype cohorts using electronic health records, J. Am. Med. Inform. Assoc. 21 (2014) 221–230. doi:10.1136/amiajnl-2013-001935.
- [33] Ganesan K, Lloyd S, Sarkar V, Discovering related clinical concepts using large amounts of clinical notes, Biomed. Eng. Comput. Biol. 7 (2016) 27–33. doi:10.4137/BECB.S36155.




