Abstract
Unstructured data stored in an electronic health record (EHR) system can be very informative but require techniques such as natural language processing (NLP) to extract the information. Developing such techniques requires shared data, but clinical data are often not easy to access. A freely available intensive care unit (ICU) database, MIMIC-III, was released in 2016 to address this issue and benefit the informatics research community. While the database has been utilized by a number of studies, the text characteristics of its notes have not been summarized. In this study, we present a summary of the basic text characteristics and the readability of the MIMIC-III ICU notes and compare the results with those of our previous study, in which proprietary EHR notes were used. The results show that the text characteristics of the MIMIC-III notes were comparable with those of the proprietary EHR notes, although the readability indices were slightly lower. The clinical notes in MIMIC-III can therefore be a viable option for researchers who are interested in clinicians’ language use but have no access to proprietary EHR systems.
Introduction
Electronic health record (EHR) systems have been widely adopted in the United States, and abundant clinical data are constantly being generated1. While structured data are preferred for analysis because of their machine-readable format, unstructured data also contain highly expressive information about patient medical history, clinical analyses, care processes, and treatments2. Unstructured data are easy for clinicians to record by typing or dictation. However, the information locked in sentences cannot be analyzed at scale without techniques such as information retrieval, text mining, and natural language processing (NLP)3 to process the data and extract information for further use.
One barrier to developing NLP techniques for clinical notes is the lack of access to shared data4. Contributing factors include proprietary commercial EHR systems, protection of patient privacy, and concerns about revealing care quality issues through note sharing. Fortunately, a freely available EHR dataset has been released. This dataset, called MIMIC, contains comprehensive EHRs of intensive care units (ICUs). MIMIC was originally developed by the Massachusetts Institute of Technology Lab for Computational Physiology and is currently in its third version (MIMIC-III)5. The MIMIC database includes both structured and unstructured data and has been used in several research studies for different purposes, such as health state estimation6, prognosis prediction8, clinical sentiment analysis9, mortality risk prediction10, and named-entity recognition11.
Furthermore, our literature search on PubMed using “MIMIC” and “intensive care” as query terms identified 13 text mining-related publications among 768 relevant records. Most of these studies used nursing notes and discharge summaries. A notable recent study compared a deep learning method (convolutional neural networks) with concept extraction-based methods on the MIMIC-III database and reported high performance in supporting patient phenotyping and cohort identification7. However, no studies have attempted to summarize the text characteristics of the MIMIC notes and compare them with other EHRs. The closest work we found was by Marafino et al., who developed and characterized sparse classifiers on the MIMIC-II nursing notes10. Efforts to summarize text characteristics can benefit the informatics research community and, through translation of research findings into clinical practice, further benefit clinicians.
In this study, we summarized the text characteristics of MIMIC-III notes with a focus on surface metrics, such as average sentence length and vocabulary coverage, and computed the readability levels of these clinical notes. The notes were grouped based on their recorded types in the dataset and scored by multiple readability formulas. The differences among the scores of each group were examined. The results were further compared with the findings of our previous study, in which unstructured clinical data were collected from a proprietary EHR system12.
Methods
Dataset
The MIMIC-III database, distributed by PhysioNet14, contains rich health information on 46,520 critical care patients. The dataset is freely available and large in scale. Before obtaining and accessing the data, all authors of this paper completed the required training and signed the data use agreement. Our copy of the MIMIC-III dataset was stored in a PostgreSQL database system (PostgreSQL Association, https://www.postgresql.org/) on a secured server behind the firewall of the University of Cincinnati College of Medicine. The MIMIC-III dataset contains more than two million patient notes entered between 2001 and 2012 in the ICUs of the Beth Israel Deaconess Medical Center. The notes fall into 15 different note types, including nursing notes and radiology reports (Table 1). A total of 886 records (0.04%) were excluded from the subsequent analyses due to their known error status (error flag = 1), leaving 2,082,294 records for the current study. The MIMIC-III clinical notes carry additional metadata, including the note-entry datetime, the note (sub)types and their descriptions, and the caregiver identifier. It is worth noting that we did not further clean the note content; the original format was kept, as well as any special language use such as acronyms, synonyms, and misspellings. The text characteristics were summarized through a set of surface metrics (e.g., average sentence length) and readability measures, which are explained in the following sections.
Table 1.
The category names and the distribution of clinical notes
| Note Type | Total | Percentage |
|---|---|---|
| Nursing/other | 822,497 | 39.48% |
| Radiology | 522,279 | 25.07% |
| Nursing | 223,182 | 10.73% |
| ECG | 209,051 | 10.04% |
| Physician | 141,281 | 6.80% |
| Others* | 164,034 | 7.88% |
| Total | 2,082,294 | 100.00% |
*Discharge Summary, Echo, Respiratory, Nutrition, General, Rehab Services, Social Work, Case Management, Pharmacy, and Consult
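For illustration, a minimal sketch of this retrieval step is shown below (not the authors’ actual code). It assumes the standard MIMIC-III PostgreSQL build, in which the notes reside in a noteevents table with category, description, iserror, and text columns; the connection parameters are placeholders.

```python
# Sketch: load the non-error note records from the MIMIC-III NOTEEVENTS table.
# Connection parameters are placeholders; schema/column names follow the
# public MIMIC-III PostgreSQL build and may differ in a local installation.
import pandas as pd
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="mimic",
                        user="mimicuser", password="changeme")

query = """
    SELECT category, description, text
    FROM mimiciii.noteevents
    WHERE iserror IS DISTINCT FROM '1';  -- drop records flagged as errors
"""
notes = pd.read_sql(query, conn)
print(notes["category"].value_counts())  # note-type distribution, as in Table 1
```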
Surface metrics
The note text was characterized using surface metrics, including average document length, average sentence length, vocabulary size (the number of distinct words across all documents), and vocabulary coverage. The vocabulary in the notes was compared against two open-source dictionaries, namely GNU Aspell15 and OpenMedSpel16. The former is a general English dictionary, while the latter is a medical-specific dictionary. The vocabulary coverage was calculated against the general English dictionary, the medical-specific dictionary, and the union of both. These dictionaries have been used in previous studies to assess the vocabulary coverage of clinical text12, 17, 18. Table 2 lists all surface metrics and their definitions.
Table 2.
Surface metrics
| Metric | Definition |
|---|---|
| Average Document Length | Average number of sentences per document |
| Average Sentence Length | Average number of words per sentence |
| Vocabulary Size | Total distinct number of words |
| Vocabulary Coverage | Number of words covered by a dictionary normalized by the vocabulary Size |
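A minimal sketch of how these surface metrics could be computed with NLTK is shown below. The two dictionary files (hypothetical aspell_en.txt and openmedspel.txt exports, one word per line) and the step that drops non-alphabetic tokens are illustrative assumptions, not the authors’ exact procedure.

```python
# Sketch of the surface metrics in Table 2, using NLTK tokenizers.
# Requires: nltk.download("punkt") to have been run once.
from nltk.tokenize import sent_tokenize, word_tokenize

def load_wordlist(path):
    # One word per line; hypothetical exports of the Aspell and OpenMedSpel dictionaries.
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

general_dict = load_wordlist("aspell_en.txt")    # GNU Aspell export (assumed file)
medical_dict = load_wordlist("openmedspel.txt")  # OpenMedSpel export (assumed file)

def surface_metrics(documents):
    sent_counts, sent_lengths, vocabulary = [], [], set()
    for doc in documents:
        sentences = sent_tokenize(doc)
        sent_counts.append(len(sentences))  # sentences per document
        for sent in sentences:
            words = [w.lower() for w in word_tokenize(sent) if w.isalpha()]
            sent_lengths.append(len(words))  # words per sentence
            vocabulary.update(words)
    return {
        "avg_document_length": sum(sent_counts) / len(sent_counts),
        "avg_sentence_length": sum(sent_lengths) / len(sent_lengths),
        "vocabulary_size": len(vocabulary),
        "general_coverage": len(vocabulary & general_dict) / len(vocabulary),
        "medical_coverage": len(vocabulary & medical_dict) / len(vocabulary),
        "union_coverage": len(vocabulary & (general_dict | medical_dict)) / len(vocabulary),
    }
```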
Readability measures
Several readability indices have been developed to provide an estimated grade level required to read a piece of text. These grade levels are expressed in numbers of years of education, e.g., 10th grade. Clinical text is particularly difficult to read and comprehend because of the pervasive use of professional terms, acronyms, abbreviations, and local jargon. Readability is especially important for patient consent forms and discharge summaries, where patients’ comprehension of the text is critical to achieving effective communication. Studies have shown that even highly educated patients find many radiology reports and discharge summary statements incomprehensible19. It is worth noting that while readability is an important text property, it is necessary but not sufficient for text comprehension and effective patient communication20.
Four readability measures were adopted to assess the MIMIC-III clinical notes: Flesch-Kincaid Grade Level (FKGL)21, Simple Measure of Gobbledygook (SMOG)22, Gunning-Fog Index (GFI)23, and Dale-Chall (DC)24. These measures are considered “classic” because of their general-purpose design and have been widely used since the 1970s; Microsoft Word25, for example, implements FKGL to report document readability. Despite this wide adoption, these measures consider few text characteristics and may have limitations in scoring medical text12,30. However, because of their simplicity of calculation and the ease of interpreting their scores, these classic methods were chosen in the present study to provide an initial assessment of the readability of the MIMIC-III note text.
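For reference, all four formulas can be computed from simple token counts, as in the sketch below. The constants are the published ones, but the syllable counter is a rough vowel-group heuristic, the GFI “complex word” count is simplified to words with three or more syllables, and the Dale-Chall familiar-word list is assumed to be supplied by the caller, so the output only approximates a full implementation.

```python
# Sketch of the four classic readability formulas from token counts.
import re
from math import sqrt

def count_syllables(word):
    # Rough approximation: count groups of consecutive vowels (at least one).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(words, sentences, familiar_words):
    n_words, n_sents = len(words), len(sentences)
    syllables = sum(count_syllables(w) for w in words)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    difficult = sum(1 for w in words if w.lower() not in familiar_words)

    fkgl = 0.39 * n_words / n_sents + 11.8 * syllables / n_words - 15.59
    smog = 1.0430 * sqrt(polysyllables * 30 / n_sents) + 3.1291
    gfi = 0.4 * (n_words / n_sents + 100 * polysyllables / n_words)
    dc = 0.1579 * (100 * difficult / n_words) + 0.0496 * n_words / n_sents
    if difficult / n_words > 0.05:
        dc += 3.6365  # Dale-Chall adjustment for texts with >5% difficult words
    return {"FKGL": fkgl, "SMOG": smog, "GFI": gfi, "DC": dc}
```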
Data Analysis
The main data source of the present study was the note records in the “NOTEEVENTS” table in the MIMIC-III dataset. The data were processed with the Python NLTK toolkit (Python Software Foundation, https://www.python.org/; NLTK Project, https://nltk.org)26 to generate the summary of surface metrics. Three readability indices, namely FKGL, SMOG, and GFI, were computed with an existing Python library called “Readability”29. The fourth readability index (DC) was computed with a self-developed Python script. A random sample of notes was examined manually to ensure the correctness of the calculations.
The readability scores were further compared among groups to examine statistically significant differences. The scores were grouped in two ways: 1) by the four readability measures and 2) by the six note types. The former helped demonstrate the agreement among the readability measures, that is, whether the classic readability measures agree with each other on the grade level of a given piece of clinical text. The latter helped demonstrate any significant differences among the note types. Since each note was scored by multiple measures, an average score across all measures was produced for the comparison among note types. For the multiple-group comparisons, the normality of the distribution in each group was examined first. If all distributions were normal, one-way ANOVA and Tukey’s HSD with Bonferroni correction would be used to examine group mean differences; if any distribution was non-normal, the corresponding non-parametric test (Kruskal-Wallis test at the 0.05 significance level) for group median comparison would be applied33,34.
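A sketch of this decision logic, assuming SciPy is available, is shown below; the actual analysis may have used different normality tests or post-hoc procedures (e.g., Tukey’s HSD with Bonferroni correction for the parametric branch, which is omitted here).

```python
# Sketch: choose between one-way ANOVA and Kruskal-Wallis based on normality.
from scipy import stats

def compare_groups(groups, alpha=0.05):
    # groups: a list of arrays of readability scores, one array per group
    all_normal = all(stats.normaltest(g).pvalue >= alpha for g in groups)
    if all_normal:
        name, result = "one-way ANOVA", stats.f_oneway(*groups)
    else:
        name, result = "Kruskal-Wallis", stats.kruskal(*groups)
    return name, result.statistic, result.pvalue
```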
Results
Surface metrics
Table 3 shows the summary of surface metrics for the six major note types. The physician notes had the highest average sentence length (72.59±118.48) and a relatively high average document length (34.25±26.66), while the ECG notes had the highest vocabulary coverage (40%) by the general English dictionary. The discharge summaries had the highest average document length (109.52±63.28) and the second largest vocabulary size (123,470). The image-related notes (i.e., Radiology and ECG) had relatively low vocabulary size, average document length, and average sentence length. These image-related notes also used many standardized terms, as indicated by their high union vocabulary coverage (around 45%).
Table 3.
Surface metrics of major note categories
| Metric | Nursing/other | Radiology | Nursing | ECG | Physician | Discharge Summary | All Types |
|---|---|---|---|---|---|---|---|
| Average Document Length (std) | 16.19 (13.77) | 14.42 (10.63) | 21.01 (16.61) | 3.78 (2.13) | 34.25 (26.66) | 109.52* (63.28) | 19.14 (24.43) |
| Average Sentence Length (std) | 15.74 (25.34) | 19.64 (9.33) | 26.65 (39.11) | 12.77 (11.35) | 72.59* (118.48) | 17.64 (10.14) | 23.14 (43.03) |
| Vocabulary Size | 197,745* | 34,166 | 73,444 | 7,527 | 80,832 | 123,470 | 347,238 |
| General Vocabulary Coverage | 13% | 32% | 25% | 40%* | 23% | 20% | 9% |
| Medical Vocabulary Coverage | 8% | 28%* | 17% | 20% | 19% | 17% | 7% |
| Union Vocabulary Coverage | 16% | 46%* | 33% | 44% | 33% | 31% | 14% |
*Highest value for each surface metric
The note type “Nursing/other” contains supplemental information about patient status (similar to progress notes) and has a very large vocabulary size. However, only a small portion of its vocabulary is covered by the two dictionaries. Based on an examination of a random sample of 100 “Nursing/other” notes, the low vocabulary coverage may result from the pervasive use of abbreviations and acronyms that cannot be recognized by the dictionaries. Overall, none of the note types had high vocabulary coverage (31-46% for most major types, 14% overall), which is consistent with previous studies in which low dictionary coverage was common in notes written by clinicians12, 17.
Readability Measures
Table 4 shows the average readability score and standard deviation for all groups. Not surprisingly, physician notes had the highest average readability score, owing to their high average sentence length (Table 3). SMOG and GFI produced comparable scores, while FKGL and DC generated similar numbers. Also, the nursing notes had relatively low readability scores compared with other types. Based on our manual examination, this may be caused by the succinct nursing note structure (smaller average sentence length), which favors the classic readability indices and leads to lower readability scores. Compared with previous studies12,32, the readability scores of the MIMIC-III clinical notes show similar trends: SMOG and GFI tended to score the readability of medical or health text higher, and the score variation among the readability measures could be up to 5 grade levels for the same text.
Table 4.
Average Readability scores of all notes
| Measure | Nursing/other | Radiology | Nursing | ECG | Physician | Discharge Summary | ALL |
|---|---|---|---|---|---|---|---|
| FKGL(std) | 4.09 (4.68) | 8.57 (2.45) | 6.89 (4.60) | 7.62 (4.39) | 11.12 (4.23) | 6.27 (2.34) | 6.7 (4.72) |
| SMOG (std) | 9.48 (4.08) | 12.86 (1.73) | 12.34 (4.45) | 12.69 (3.88) | 14.83 (4.54) | 11.7 (1.29) | 11.67 (4.27) |
| GFI (std) | 9.02 (3.85) | 13.63 (2.06) | 12.27 (3.09) | 13.97 (2.95) | 15.54 (2.01) | 12.23 (1.79) | 11.8 (3.89) |
| DC (std) | 5.59 (2.45) | 7.08 (0.78) | 6.9 (2.08) | 8.33 (1.42) | 8.34 (5.08) | 6.88 (0.69) | 6.74 (2.55) |
| Average | 7.05 | 10.54 | 9.6 | 10.65 | 12.46 | 9.27 | 9.23 |
Another observation is that the readability scores have a high variation (standard deviation) in each group. A further investigation of the score distribution by grade level is summarized in Table 5. As can be seen, many scores were below grade 6 (elementary school) or above grade 17 (graduate school). The SMOG index seems to perform more consistently and has fewer outliers, whereas FKGL has a much larger variation. To minimize the potential bias from these outlier scores, clinical notes with at least one score below grade 6 or above grade 17 were dropped. That is, only notes with all four readability scores between grade 6 and 17 were selected for the following score comparison and statistical test.
Table 5.
Readability grade levels summary for all notes by measures
| Measure | Grade level <6 | Grade level 6-17* | Grade level >17 |
|---|---|---|---|
| FKGL | 965,985 (46%) | 939,462 (45%) | 177,499 (9%) |
| SMOG | 101,549 (5%) | 1,829,224 (88%) | 152,173 (7%) |
| GFI | 187,646 (9%) | 1,555,139 (75%) | 340,161 (16%) |
| DC | 498,393 (24%) | 1,571,466 (75%) | 13,087 (1%) |
*Notes with all four readability scores in this range were selected for the subsequent comparisons
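A small sketch of this selection rule, assuming the per-note scores are held in a pandas DataFrame with one column per measure, is shown below; the sample values are hypothetical.

```python
# Sketch: keep only notes whose four readability scores all fall within grades 6-17.
import pandas as pd

# Hypothetical per-note scores; in practice the frame holds one row per note.
scores = pd.DataFrame({
    "FKGL": [4.1, 8.6, 11.2],
    "SMOG": [9.5, 12.9, 14.8],
    "GFI":  [9.0, 13.6, 15.5],
    "DC":   [5.6, 7.1, 8.3],
})

measures = ["FKGL", "SMOG", "GFI", "DC"]
in_range = (scores[measures] >= 6) & (scores[measures] <= 17)
selected = scores[in_range.all(axis=1)]  # notes retained for the group comparisons
```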
The statistical analysis showed that the distribution of each readability group was not normal, and the median readability score of DC (7.24) was significantly lower than the median scores of SMOG (12.59) and GFI (13.76). The median score of FKGL (8.26) appeared lower than those of SMOG and GFI and closer to that of DC; however, these differences were not statistically significant. This result indicates that the readability indices did not all agree with each other, even after removing potential outliers in a conservative way. The DC method deviated the most from the other methods, possibly because the DC scores are largely determined by its predefined word list38, which may not cover medical terms well and thus leads to lower grade levels. However, this may be an opportunity for DC to become a medical-specific measure by tailoring the word list to include medical terms. Figure 1 illustrates the distributions of the readability scores generated by the four selected readability measures. The SMOG index exhibits lower variance at higher grade levels, which may make it more suitable for analyzing different types of clinical documents in addition to typical health information materials.
Figure 1.
Variance in readability score of selected notes
Discussion
In this study, we reported the text characteristics of the MIMIC-III clinical notes in terms of surface metrics and readability. Our findings showed that the major types of MIMIC-III clinical notes have their own distinct patterns. Physician notes had the highest average sentence length, which directly contributed to the higher readability indices (harder to read) generated by the classic readability measures. Discharge summaries tended to be lengthy. Nursing notes, on the other hand, had larger vocabulary sizes. Radiology and ECG reports had smaller vocabulary sizes, shorter average document and sentence lengths, and higher vocabulary coverage than other types. While some of these text characteristics seem reasonable for the corresponding note types, others require further investigation of the content to understand the detailed language use.
Moreover, we found that the MIMIC-III clinical notes have text characteristics comparable to those of the inpatient EHR dataset in our previous study12. Specifically, the vocabulary coverage in all major note types is around 30-45%. The low vocabulary coverage might be improved by correcting spelling errors and/or using different dictionaries to identify the corpus with optimal coverage for ICU notes. On the other hand, the average readability scores are slightly lower than the university and college level (12 or more years of education), with the ECG notes being the hardest type. A comparable variation of readability scores (up to 5 grade levels) was also found32. We therefore encourage researchers who are interested in clinicians’ language use but have no access to proprietary EHR systems to utilize these freely available ICU notes for their research, with the caution that findings based on ICU notes may be limited and may not generalize to other clinical settings.
Our results once again confirmed the limitations of classic readability measures in scoring medical text. The readability scores were largely driven by key text characteristics such as average sentence length, and the variation of the scores was high. A recent study has shown that the variation of readability scores on the same text corpus can result from the sample size, the location of word sampling, the text format, and the calculation method32. Zeng-Treitler et al. have proposed and developed a medical-specific readability measure30. However, this tool has not been validated through human judgment and has not yet been widely disseminated and used. More research is needed to consider the specific text characteristics of clinical notes and their impact on note readability. Moreover, user-centered evaluation is necessary to validate new measures and demonstrate their effectiveness.
This study has two limitations. First, we directly adopted the default note types in the MIMIC-III dataset without any modification. However, these notes may not be well categorized, since there was no single category for progress notes. Also, the descriptions of the note types were unclear, even though more than 4,000 unique subcategories were included in the database. Researchers who would like to use these clinical notes should review the note categorization and modify it based on their needs and goals. Second, we analyzed the clinical notes in their entirety and did not perform any parsing or spelling correction. Also, it is known that clinical text is frequently copied from previous notes or templates31. Duplicate text carries text characteristics forward and can potentially affect the readability scores. Since this is an exploratory study, we chose to retain the original format to show the overall text characteristics.
Our next step is to investigate more text characteristics and the language use in the MIMIC-III notes. We are particularly interested in the language patterns that can affect text comprehension and patient communication, e.g., abbreviations, acronyms, hedge phrases13, and spelling errors35. After all, text comprehension and understandability are major challenges when delivering discharge summaries and consent forms to patients. We will map the medical concepts in the MIMIC-III notes using techniques such as MetaMap36 and cTAKES37 in an effort to move the analysis from the syntactic level to the semantic level. We also suspect that there are prevalent and significant copy-paste behaviors in this dataset and plan to identify duplicate pieces of text in each patient admission. Other future directions include exploring the temporal trends of text characteristics, addressing the methodological inconsistency of readability measures, especially on medical text, and improving the readability of discharge summaries to increase patient engagement.
Conclusion
We analyzed the text characteristics of a freely available, large collection of ICU notes and reported their surface metrics and readability indices. The findings showed that the major note types have their own distinct text characteristics while showing vocabulary coverage and readability scores comparable to notes sampled from a proprietary EHR system in our previous study. We encourage researchers to utilize this shared dataset, which contains rich clinical language patterns, to develop new informatics methodologies and solutions as well as to conduct comparative studies.
Acknowledgements
We thank Dr. Andy Spooner and Mr. Karthikeyan Meganathan in the Department of Biomedical Informatics at the University of Cincinnati for their input to this study.
References
- 1. Henry J, Pylypchuk Y, Searcy T, Patel V. Adoption of electronic health record systems among US non-federal acute care hospitals: 2008–2015. The Office of the National Coordinator for Health Information Technology; 2016.
- 2. Rosenbloom ST, Denny JC, Xu H, Lorenzi N, Stead WW, Johnson KB. Data from clinical notes: a perspective on the tension between structure and flexible documentation. Journal of the American Medical Informatics Association: JAMIA. 2011;18(2):181–6. doi: 10.1136/jamia.2010.007237.
- 3. Gonzalez-Hernandez G, Sarker A, O'Connor K, Savova G. Capturing the patient’s perspective: a review of advances in natural language processing of health-related text. Yearbook of Medical Informatics. 2017;26(01):214–27. doi: 10.15265/IY-2017-029.
- 4. Chapman WW, Nadkarni PM, Hirschman L, D'Avolio LW, Savova GK, Uzuner O. Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions. Journal of the American Medical Informatics Association: JAMIA. 2011;18(5):540–3. doi: 10.1136/amiajnl-2011-000465.
- 5. Johnson AEW, Pollard TJ, Shen L, Lehman L-wH, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible critical care database. Scientific Data. 2016;3:160035. doi: 10.1038/sdata.2016.35.
- 6. Zalewski A, Long W, Johnson AEW, Mark RG, Lehman L-wH. Estimating patient’s health state using latent structure inferred from clinical time series and text. 2017 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI); 2017 Feb. p. 16–19.
- 7. Dai Y, Lokhandwala S, Long W, Mark R, Lehman L-wH. Phenotyping hypotensive patients in critical care using hospital discharge summaries. IEEE-EMBS International Conference on Biomedical and Health Informatics; 2017. p. 401–4.
- 8. Farhan W, Wang Z, Huang Y, Wang S, Wang F, Jiang X. A predictive model for medical events based on contextual embedding of temporal sequences. JMIR Medical Informatics. 2016;4(4):e39. doi: 10.2196/medinform.5977.
- 9. Ghassemi MM, Mark RG. A visualization of evolving clinical sentiment using vector representations of clinical notes. 2015 Computing in Cardiology Conference (CinC); 2015 Sep. p. 6–9.
- 10. Marafino BJ, John Boscardin W, Adams Dudley R. Efficient and sparse feature selection for biomedical text classification via the elastic net: application to ICU risk stratification from nursing notes. Journal of Biomedical Informatics. 2015;54:114–20. doi: 10.1016/j.jbi.2015.02.003.
- 11. Wu Y, Xu J, Jiang M, Zhang Y, Xu H. A study of neural word embeddings for named entity recognition in clinical text. AMIA Annual Symposium Proceedings. 2015;2015:1326–33.
- 12. Wu DTY, Hanauer DA, Mei Q, Clark PM, An LC, Lei J, et al. Applying multiple methods to assess the readability of a large corpus of medical documents. Studies in Health Technology and Informatics. 2013;192:647–51.
- 13. Hanauer DA, Liu Y, Mei Q, Manion FJ, Balis UJ, Zheng K. Hedging their mets: the use of uncertainty terms in clinical documents and its potential implications when sharing the documents with patients. AMIA Annual Symposium Proceedings. 2012;2012:321–30.
- 14. Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation. 2000;101(23):e215–e220. doi: 10.1161/01.cir.101.23.e215.
- 15. Atkinson K. GNU Aspell. Available from: http://aspell.net/
- 16. e-MedTools. OpenMedSpel. 2017. Available from: https://e-medtools.com/openmedspel.html
- 17. Zheng K, Mei Q, Yang L, Manion FJ, Balis UJ, Hanauer DA. Voice-dictated versus typed-in clinician notes: linguistic properties and the potential implications on natural language processing. AMIA Annual Symposium Proceedings. 2011;2011:1630–8.
- 18. Wu DTY, Hanauer DA, Mei Q, Clark PM, An LC, Proulx J, et al. Assessing the readability of ClinicalTrials.gov. Journal of the American Medical Informatics Association. 2016;23(2):269–75. doi: 10.1093/jamia/ocv062.
- 19. Zeng-Treitler Q, Goryachev S, Kim H, Keselman A, Rosendale D. Making texts in electronic health records comprehensible to consumers: a prototype translator. AMIA Annual Symposium Proceedings. 2007;2007:846–50.
- 20. Garner M, Ning Z, Francis J. A framework for the evaluation of patient information leaflets. Health Expectations. 2012;15(3):283–94. doi: 10.1111/j.1369-7625.2011.00665.x.
- 21. Kincaid JP, Fishburne RP Jr, Rogers RL, Chissom BS. Derivation of new readability formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for Navy enlisted personnel. Naval Technical Training Command, Millington, TN, Research Branch; 1975.
- 22. Mc Laughlin GH. SMOG grading: a new readability formula. Journal of Reading. 1969;12(8):639–46.
- 23. Gunning R. Technique of clear writing. 1968.
- 24. Chall JS, Dale E. Readability revisited: the new Dale-Chall readability formula. Brookline Books; 1995.
- 25. Microsoft Corporation. Determine the reading level of a document in Word for Mac. 2018. Available from: https://support.office.com/en-us/article/determine-the-reading-level-of-a-document-in-word-for-mac-acec642a-f4e5-44ee-bb08-d47fb381bb94
- 26. Bird S, Klein E, Loper E. Natural language processing with Python: analyzing text with the Natural Language Toolkit. O'Reilly Media, Inc.; 2009.
- 27. Wikipedia. Cosine similarity. Available from: https://en.wikipedia.org/wiki/Cosine_similarity
- 28. wikiHow. How to calculate Spearman’s rank correlation coefficient. Available from: https://www.wikihow.com/Calculate-Spearman%27s-Rank-Correlation-Coefficient
- 29. Mautner M. Readability. 2014. Available from: https://github.com/mmautner/readability
- 30. Kim H, Zeng-Treitler Q, Goryachev S, Keselman A, Slaughter L. Text characteristics of clinical reports and their implications for the readability of personal health records. In: Arnott Smith C, editor. IOS Press; 2007.
- 31. Tsou AY, Lehmann CU, Michel J, Solomon R, Possanza L, Gandhi T. Safe practices for copy and paste in the EHR: systematic review, recommendations, and novel model for health IT collaboration. Applied Clinical Informatics. 2017;8(1):12–34. doi: 10.4338/ACI-2016-09-R-0150.
- 32. Wang L-W, Miller MJ, Schmitt MR, Wen FK. Assessing readability formula differences with written health information materials: application, results, and recommendations. Research in Social and Administrative Pharmacy. 2013;9(5):503–16. doi: 10.1016/j.sapharm.2012.05.009.
- 33. Siegel S. Nonparametric statistics for the behavioral sciences. McGraw-Hill; 1956.
- 34. Giraudoux P. Package ‘pgirmess’. 2018.
- 35. Keselman A, Smith CA. A classification of errors in lay comprehension of medical documents. Journal of Biomedical Informatics. 2012;45(6):1151–63. doi: 10.1016/j.jbi.2012.07.012.
- 36. MetaMap - A Tool for Recognizing UMLS Concepts in Text. Available from: https://metamap.nlm.nih.gov/
- 37. Apache cTAKES - clinical Text Analysis Knowledge Extraction System. Available from: http://ctakes.apache.org/
- 38. The Dale-Chall 3,000 Word List for Readability Formulas. Available from: http://www.readabilityformulas.com/articles/dale-chall-readability-word-list.php

