Abstract
Clinical documentation is often expressed in natural language text, yet providers typically organize these notes into common sections, such as “history of present illness” or “physical examination.” We developed a hierarchical section header terminology, supporting mappings to LOINC and other vocabularies; it contained 1109 concepts and 4332 synonyms. Physicians evaluated it against LOINC and the Evaluation and Management billing schema using a randomly selected corpus of history and physical notes. Evaluated documents contained a median of 54 sections and 27 “major sections.” There were 16,196 total sections in the evaluation note corpus. The terminology contained 99.9% of the clinical sections; LOINC matched 77% of section header concepts and 20% of section header strings in those documents. The section terminology may enable better clinical note understanding and interoperability. Future development and integration into natural language processing systems is needed.
Introduction
Electronic medical records contain rich clinical observations and patient history, often stored in the form of “natural language” clinical notes generated by providers and other healthcare workers. Many different types of clinical notes exist; one of the most recognized is the “history and physical” note (H&P) generated during hospital admissions and during some clinic visits. H&Ps and other forms of clinical documentation typically follow a common organization into segments, or sections, such as “chief complaint” or “history of present illness.” Many systems have been developed to understand natural language documents, often mapping them to standard terminologies to gain a greater “computable” understanding. While much work has been done in this area, few have formally studied sections in clinical documentation. We present the development of a section header terminology for H&Ps and a preliminary evaluation over a test set of H&Ps.
Recognizing sections in clinical notes improves understanding of clinical notes and can aid natural language processing (NLP) applications. For example, a diagnosis of “coronary disease” in a patient’s “past medical history” is very different from the same diagnosis found in the “family medical history” section. Understanding document sections can also aid word-sense disambiguation: the acronym “BS” means “bowel sounds” in the “abdominal exam”, “breath sounds” in the “chest exam”, and “blood sugar” in the laboratory section. Finally, the United States Evaluation and Management Coding (“E&M coding”) system uses section counts to help calculate physician compensation.1
Currently available structured terminologies were not designed to represent the hierarchy or vast synonymy of sections possible in clinical notes. The Unified Medical Language System (UMLS) does contain many section titles through its component vocabularies, but only the Logical Observation Identifiers Names and Codes (LOINC) was explicitly designed to include note sections.2, 3 Originally developed to represent laboratory and radiology results, LOINC contains over 300 canonical terms and over 180 unique strings that represent section headers for notes and is a standard used by the HL7 Clinical Document Architecture (CDA).3, 4 However, LOINC does not provide extensive synonymy (e.g., “HPI” can be a synonym for “history present illness”) and supports only a few hierarchy levels (e.g., it does not represent that a “HEENT” exam contains an eye exam which contains a fundoscopic exam).
The Quick Medical Reference (QMR)® Knowledge Base, the result of more than 35 person-years of development effort to build an evidence-based diagnostic support engine, is also a useful resource for section header names.5 It contains more than 4,000 findings organized by a multilevel hierarchy of 525 sections. Since QMR’s hierarchy is based on the clinical findings used to construct the QMR Knowledge Base, its organization is dictated by the findings in its vocabulary rather than by clinical practice.
Other vocabularies, such as SNOMED-CT or Medical Subject Headings (MeSH), also contain section headers, but they have incomplete coverage and organization. For example, children of the concept “physical examination” in MeSH include a number of generic methods of examination, such as “auscultation” and “palpation,” combined with only a few functional sections (“neurological examination” and “muscle strength”).6 SNOMED-CT includes a number of concepts relating to “cardiovascular exam” but not any subsections such as “cardiac auscultation” or “jugular venous pulse assessment.”7 The United States E&M Coding system defines about 40 sections used to assess note complexity.1
Methods
Section Header Terminology Creation
The goal of the newly developed section tagging terminology is to provide a list of concepts and synonyms that can function as both a reference and interface terminology8 for clinical section headings and their subsections. We defined a “section” as a clinically meaningful grouping of symptoms, history, findings, results, or clinical reasoning that is not itself part of the unique narrative for a patient. A valid clinical document section (segment) header contains words that provide context for the encapsulated text but that do not themselves add specific clinical information, such as a diagnosis or symptom. For example, “back pain” is not a valid section tag because it is the name of a symptom; furthermore, it may or may not originate in the back (e.g., a perinephric abscess and acute pancreatitis can present with back pain). The word “back,” however, in the phrase “back: pain on flexion” indicates the anatomical region and is a valid tag because it provides the location for the word “pain.” Similarly, a “past medical history” (a valid section) of “back pain” provides context and timing for the symptom “back pain.” Removal of the concept “past medical history” does not alter the presence or absence of any given finding in the note. Using these understandings as a guide, we sought to develop a terminology to adequately represent H&Ps, providing an initial step toward representing clinical documentation sections in general. We took a data-driven approach that seeks to model the sections that clinicians use in actual notes instead of modeling the “ideal H&P.”
The QMR findings hierarchy and LOINC were key enabling reference vocabularies for this project. The QMR vocabulary (as opposed to the knowledge base) has been declared public domain. We used the basic organizational structure of the QMR Findings Hierarchy for the initial hierarchical structure of the new section terminology, keeping approximately 150 patient history and physical headers and 160 laboratory, imaging, and pathological headers. We then revised the hierarchy by incorporating all relevant LOINC headers (approximately 155 unique strings), modifying the structure as appropriate. We expanded and revised the section hierarchy, concepts, and synonym lists based on the advice of clinicians and review of general and subspecialty clinical textbooks spanning eight decades.9–16 We obtained and incorporated the list of section terms created by Meystre and Haug.17 It contained 539 strings, many of which were mapped to LOINC terms and to a common “concept name.”
To revise the terminology based on actual clinical notes, we selected a random corpus of about 10,000 H&Ps from our EMR and all H&P and relevant outpatient visit templates (n=82) used in our template-based “notewriter.”18 In the Vanderbilt EMR, users can create H&Ps via several mechanisms: the notewriter, dictation (which may or may not be template-based, with variable templates), or, extremely rarely, hand-written notes. Users can also type documents without a template or upload documents written in Microsoft Word®.
We searched through the note templates and randomly selected H&Ps for strings that appeared to be section titles. We processed these templates with a locally-developed section header tagging program and marked all strings that were possible section titles for review. Focusing on sensitivity, this list included any string containing at least one letter and fewer than 55 characters (determined empirically) that (a) contained multiple capital letters anywhere in the sentence (including strings with boundaries between uppercase and nonuppercase characters such as “OP clear” or “TEMP 97”); (b) ended in a colon, dash, or period; or (c) matched a concept in the terminology after phrase and sentence filtering. This process resulted in 1045 candidate section strings; of these, we added 301 new synonyms or concepts, refining the hierarchy where necessary. We also manually reviewed a number of subspecialty clinic notes to assure representation of detailed elements of the physical exam and past medical history that may pertain only to certain subspecialties, such as neuro-ophthalmology or rheumatology. The third step was evaluation of the then-current section header terminology against the training corpus of H&Ps extracted from the EMR. We wrote a program that processed the training set of documents using the same rules above to look for possible section strings not currently in the terminology. This resulted in 10,138 additional unique strings. We manually reviewed all tags with more than 20 occurrences in the document corpus, resulting in only 13 additions to the terminology.
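The three candidate-detection heuristics above can be expressed as a simple filter. The sketch below is our own minimal illustration, not the actual section-tagging program described in the text; the function name, regular expressions, and the normalized concept lookup are assumptions.

```python
import re

def is_candidate_section_string(s: str, known_concepts: set[str]) -> bool:
    """Flag strings that may be section titles, per the three heuristics.

    Candidates must contain at least one letter and be fewer than 55
    characters long, and satisfy at least one of rules (a)-(c).
    """
    if not re.search(r"[A-Za-z]", s) or len(s) >= 55:
        return False
    # (a) multiple capital letters, or an uppercase/non-uppercase boundary
    #     (catches strings such as "OP clear" or "TEMP 97")
    if len(re.findall(r"[A-Z]", s)) > 1 or re.search(r"[A-Z][^A-Za-z]", s):
        return True
    # (b) ends in a colon, dash, or period
    if s.rstrip().endswith((":", "-", ".")):
        return True
    # (c) matches a terminology concept after simple normalization
    return s.strip().lower() in known_concepts
```

For example, `"TEMP 97"` passes via rule (a), `"allergies:"` via rule (b), and a bare `"assessment"` only if the normalized string is already a known concept.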
The H&P terminology resulting from this process contained 1109 section header concepts and 4332 synonyms with a maximum tree depth of 10 levels. It is freely available in an XML format at http://knowledgemap.mc.vanderbilt.edu/research.
Terminology Schema
Following principles established by Cimino19 and Chute20, the section header terminology is concept-oriented and supports polyhierarchy, links to external vocabularies, concept and term attributes, and some assertional knowledge. Each section concept is distinct and has a unique numerical “concept identifier” (CID) and unique string name (the “concept name”) to which multiple strings may be mapped through unique “string identifiers” (SID) in a many-to-many fashion, much like the organization of the UMLS’s SUI (string) and CUI (concept) identifiers. The unique concept name is composed without spaces. For example, the concept “physical_examination,” whose CID is 545, is mapped to 34 strings, each with a unique SID.
Concepts are organized in a hierarchical structure with parent-child relationships. For instance, a “shoulder exam” is a child of “musculoskeletal exam.” Some concepts can have multiple parents. Each concept with multiple parents has a primary parent-child relationship and “alternate” relationship(s). When multiple anatomical categorizations are possible, we chose the nearest regional anatomic parent as the primary relation. For example, “jugular_venous_pulse_exam” is a child of both “neck_exam” and “cardiovascular_exam”; its primary parent is taken by the above heuristic to be “neck_exam” (Figure 1). We assigned relationships according to categorizations in textbooks or the medical literature.
Figure 1.
Partial diagram of the section terminology.
The red link from cardiovascular_exam to jugular_venous_pulse_exam is an “alternate” parent-child relationship; the primary parent is neck_exam.
To support interaction with other, potentially existing, applications, the section header terminology data model retains source terminology identifiers (such as LOINC IDs or UMLS CUIs) so that one could restrict section matching to those concepts belonging to a given external terminology or use external concept identifiers (e.g., LOINC IDs) instead of the section terminology’s CIDs.
Specifying certain attributes for each concept and string can improve section header concept matching. String attributes include a string type (“concept name,” “preferred term,” “abbreviation or acronym,” “suppressible synonym,” “normalized,” and “normalized without stop words”). The latter two are generated strings that speed matching in a section tagger application. The “concept name” is the unique name of the concept in the terminology. Preferred terms are the most common strings used for a concept. Suppressible synonyms are strings that sometimes denote section headers but often should be ignored; they include single-letter strings (e.g., “A” for the concept “assessment”), common words that rarely represent specific sections (e.g., “patient” for “patient name”), and some abbreviations (e.g., “PT” for “prothrombin time,” which is often used to mean “patient” in the context of a note).
Section tags have two major attributes: a concept type and a data type. Concept types can be either “atomic” or “composite.” Composite concepts represent combinations of atomic concepts, such as “hematologic-lymphatic-oncologic.” We defined very common groupings, such as “head, eyes, ears, nose, throat,” as atomic concepts that contain each component section as a child concept. The data types refer to the type of text contained therein, such as “prose” (the default), “short,” “date/time,” “document title,” or “numeric.”
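The generated “normalized” and “normalized without stop words” string types can be produced mechanically from each synonym. The sketch below is our own illustration of that generation step, assuming a lowercase/strip-punctuation normalization and a small illustrative stop-word list; the paper does not specify its exact normalization rules.

```python
import re

# Illustrative stop-word list; the terminology's actual list is not published here.
STOP_WORDS = {"of", "the", "and", "a", "an"}

def normalize(header: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    s = re.sub(r"[^a-z0-9\s]", " ", header.lower())
    return " ".join(s.split())

def normalize_without_stop_words(header: str) -> str:
    """Drop stop words from the normalized form to ease lookup."""
    return " ".join(w for w in normalize(header).split() if w not in STOP_WORDS)
```

For instance, the header string “History of Present Illness:” normalizes to “history of present illness”, and without stop words to “history present illness”, which matches the concept name “history_present_illness”.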
Section Header Evaluation
After creation of the terminology, we processed a portion of the training set of H&Ps (n=5885) with a locally-developed section tagging application, called SecTag. The SecTag system is designed to detect explicitly labeled sections and those inferred by context (e.g., “His mother has heart disease” implies presence of a “maternal medical history” section). The SecTag application uses a number of techniques to improve matching, such as string normalization, NLP methods, and machine learning algorithms. We performed a separate evaluation using a test set of H&Ps not previously reviewed in the development of the terminology. The test set contained 1200 randomly selected general and subspecialty H&Ps. Eleven physician reviewers evaluated the coverage of the section header terminology on up to 60 H&Ps using a web-based interface which allowed them to mark each identified section as well as add sections not identified via the section tagging application. To help calculate interrater agreement, the reviewers scored the same first five documents and then every eighth document thereafter. The interrater agreement was good (Kappa = 0.70, p < 0.0001).
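For reference, the interrater agreement statistic reported above (Cohen's kappa) corrects raw agreement for agreement expected by chance. The implementation below is our own minimal sketch; the paper does not describe how its kappa was computed.

```python
def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Cohen's kappa for two raters scoring the same items.

    po = observed agreement; pe = chance agreement from each rater's
    marginal label frequencies. Undefined (division by zero) if pe == 1.
    """
    assert len(a) == len(b) and a, "raters must score the same, non-empty items"
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    pe = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    return (po - pe) / (1 - pe)
```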
To distinguish the key clinical sections from more minor ones (e.g., subsections of the plan or medical record number headers), we identified a priori a list of 29 “major sections,” which included common sections (e.g., “chief complaint,” “history of present illness,” “physical examination,” and their common subsections). We asked reviewers to identify all major sections, even if labeled only as a subsection. For example, if the section tagger application found “tobacco use history,” this counted as a major section if the parent “substance use history” was not found in that note. We also evaluated both the string and concept coverage of LOINC and the E&M schema.
Results
Physician reviewers scored 319 unique H&Ps. The training and test set contained 631 and 355 unique sections and 248,520 and 16,196 total sections, respectively. In the test set, 7,969 (49%) of the sections were major sections (or a subsection of a major section without presence of the major section “parent” concept). Notes in the training set contained a median of 54 (interquartile range (IQR) 32–74) sections and subsections; notes in the test set were similar (p=0.11), with a median of 52 (IQR 31–70) sections and subsections. Of these sections, a median of 27 (IQR 20–30) were classified as “major sections.” The most common sections in the training set are listed in Table 1. Only nineteen sections appeared in more than half of the H&Ps.
Table 1.
The most common section headers. Bolded section headers were classified as “major sections.”
Section Header Name | H&P Frequency |
---|---|
physical_examination | 94.38% |
history_present_illness | 92.25% |
past_medical_history | 79.92% |
patient_name | 77.99% |
cardiovascular_exam | 76.41% |
chief_complaint | 76.01% |
personal_and_social_history | 73.17% |
pulmonary_exam | 71.21% |
attending_physician | 70.65% |
review_of_systems | 70.62% |
family_medical_history | 70.04% |
medications | 68.50% |
allergies_and_adverse_reactions | 68.11% |
neck_exam | 67.78% |
general_exam | 61.36% |
vital_signs | 56.01% |
neurological_exam | 54.68% |
medical_record_number | 53.71% |
Reviewers identified 160 (1.0%) sections that SecTag omitted. On manual review, only 13 of these sections (0.09% of identified sections) were absent from the terminology: four laboratory or radiology findings, three plan subdivisions, and several alternative groupings of existing sections (e.g., a “nose and ear exam” [without throat], or a facial exam as a separate component of the head exam). One omitted section was a “major” section: a composite grouping, “neurological and musculoskeletal exam,” that was not in the terminology.
In the test set, LOINC contained 85% of the major section concepts and 77% of all sections tagged; however, only 20% of the document-labeled string headers matched after normalization (Table 2). The most common major sections missing from LOINC were family medical history entries for first-degree relatives and grouped physical exam subcomponents (e.g., “musculoskeletal and extremity exam”). LOINC also did not contain major sections that matched more granular document subsections (e.g., “jugular venous pulse” instead of “neck exam” or “cardiovascular exam”). The E&M coding schema represented 50% of all sections and 60% of the major sections (Table 2). (Because the E&M coding schema is not a formal terminology, we did not report string matches; its terms can vary slightly between references.)
Table 2.
LOINC and E&M Coverage in Identified Sections. LOINC= Logical Observation Identifiers Names and Codes; E&M = Evaluation and Management coding schema.
All sections | Major sections | |
---|---|---|
Section Terminology | 16,183 (99.9%) | 7,968 (100%) |
LOINC Concepts | 12,407 (76.6%) | 6,739 (84.6%) |
LOINC Strings | 3,246 (20.0%) | 2,535 (31.8%) |
E&M Concepts | 8,075 (49.9%) | 4,813 (60.4%) |
Total Sections | 16,196 | 7,969 |
Discussion
The current study is one of the first large-scale efforts to formally evaluate a section header terminology for clinical notes. The section header terminology created in this study contained >99.9% of section header tags in H&Ps and contained a greater number of section headers than LOINC. The H&Ps in this study contained a wide range of sections; review of the concepts by frequency showed that even the most rarely matched sections were clinically meaningful. A clinician could use the terminology when generating notes with a structured documentation tool to allow appropriate granularity and expressivity for his or her specialty. The terminology also may enable superior understanding of free-text clinical notes by coupling section identification with traditional NLP tools. Either use could help automate billing, improve decision support applications, and automate generation of problem lists17, among other uses.
The HL7 CDA provides a framework to represent and exchange electronic notes using a section terminology, such as LOINC (the current standard) or the one proposed.4 In this study, LOINC covered the majority of major sections concepts (using the synonymy present in our terminology) but had poor coverage of the actual document strings describing those concepts. In some cases, the LOINC match connoted a slightly different meaning than the document phrase; we mapped some terms from specialized components of LOINC to serve a more general purpose, such as “Treatment Plan” from a psychological corpus to the general “plan” section. Since LOINC is designed as a reference terminology, its synonymy is designed more to ease lookup than enable use in NLP applications. Extending LOINC with additional concepts and synonyms would improve its utility for NLP applications.
Since the terminology is hierarchical, it enables translation of detailed, granular concepts to coarser terminologies such as the E&M coding schema or LOINC. This translation allows compatibility with existing systems that may rely on other terminologies while allowing providers to use more specific sections appropriate for their practice. For example, an ophthalmologist could use the more specific section label “slit lamp exam” (not present in LOINC or E&M), and an NLP system could translate this to the general “eye exam” section (present in both LOINC and the E&M schema).
This study must be interpreted in light of several limitations. We have not validated the terminology’s coverage on other document types or on documents from other institutions. While we designed the terminology to represent progress notes, consultation notes, and clinic notes in addition to H&Ps, these were not generally included in the test set. Currently, we are beginning to extend the terminology to contain sections from discharge summaries. Procedure (medical or surgical) and other note types may not be covered adequately at the current time.
Acknowledgments
This work was supported by National Library of Medicine grants T15 LM007450 and R01 LM007995 and the National Cancer Institute grant R21 CA116573. We appreciate the assistance of Drs. Meystre and Haug, who shared their list of section headers with us.
References
- 1. E/M History Criteria. (Accessed July 13, 2007, at http://www.fpnotebook.com/MAN3.htm)
- 2. UMLS Source Vocabularies. National Library of Medicine. (Accessed June 27, 2007, at http://www.nlm.nih.gov/research/umls/metab4.html)
- 3. Logical Observation Identifiers Names and Codes. (Accessed June 19, 2007, at http://www.regenstrief.org/medinformatics/loinc/)
- 4. Dolin RH, Alschuler L, Beebe C, et al. The HL7 Clinical Document Architecture. J Am Med Inform Assoc. 2001;8(6):552–69. doi: 10.1136/jamia.2001.0080552.
- 5. Miller RA, Pople HE Jr, Myers JD. Internist-1, an experimental computer-based diagnostic consultant for general internal medicine. N Engl J Med. 1982;307(8):468–76. doi: 10.1056/NEJM198208193070803.
- 6. UMLS Knowledge Source Server. (Accessed July 3, 2007, at http://umlsks.nlm.nih.gov/kss/)
- 7. SNOMED CT Browser. (Accessed July 3, 2007, at http://terminology.vetmed.vt.edu/SCT/menu.cfm)
- 8. Rosenbloom ST, Miller RA, Johnson KB, Elkin PL, Brown SH. Interface terminologies: facilitating direct entry of clinical data into electronic health record systems. J Am Med Inform Assoc. 2006;13(3):277–88. doi: 10.1197/jamia.M1957.
- 9. Norris GW, Landis HRM, Krumbhaar EB, Montgomery CM. Diseases of the chest and the principles of physical diagnosis. 5th ed. Philadelphia: W.B. Saunders Company; 1933.
- 10. Wartenberg R. The examination of reflexes, a simplification. Chicago: The Year Book Publishers; 1945.
- 11. Burch GE. A primer of venous pressure. Philadelphia: Lea & Febiger; 1950.
- 12. Walker H. Physical diagnosis. St. Louis: Mosby; 1952.
- 13. Fowler NO. Physical diagnosis of heart disease. New York: Macmillan; 1962.
- 14. Martini P. Principles and practice of physical diagnosis. 3rd ed. Philadelphia: Lippincott; 1962.
- 15. Perloff JK. Physical examination of the heart and circulation. Philadelphia: Saunders; 1982.
- 16. Swartz MH. Textbook of physical diagnosis: history and examination. 5th ed. Philadelphia: Saunders; 2006.
- 17. Meystre S, Haug PJ. Automation of a problem list using natural language processing. BMC Med Inform Decis Mak. 2005;5:30. doi: 10.1186/1472-6947-5-30.
- 18. Rosenbloom ST, Grande J, Geissbuhler A, Miller RA. Experience in implementing inpatient clinical note capture via a provider order entry system. J Am Med Inform Assoc. 2004;11(4):310–5. doi: 10.1197/jamia.M1461.
- 19. Cimino JJ. Desiderata for controlled medical vocabularies in the twenty-first century. Methods Inf Med. 1998;37(4–5):394–403.
- 20. Chute CG, Cohn SP, Campbell JR. A framework for comprehensive health terminology systems in the United States: development guidelines, criteria for selection, and public policy implications. ANSI Healthcare Informatics Standards Board Vocabulary Working Group and the Computer-Based Patient Records Institute Working Group on Codes and Structures. J Am Med Inform Assoc. 1998;5(6):503–10. doi: 10.1136/jamia.1998.0050503.