AMIA Annual Symposium Proceedings. 2021 Jan 25;2020:1441–1450.

Normalizing Clinical Document Titles to LOINC Document Ontology: an Initial Study

Xu Zuo 1, Jianfu Li 1, Bo Zhao 1, Yujia Zhou 1, Xiao Dong 1, Jon Duke 2,8, Karthik Natarajan 3,8, George Hripcsak 3,8, Nigam Shah 4,8, Juan M Banda 5,8, Ruth Reeves 6,8, Timothy Miller 7,8, Hua Xu 1,8
PMCID: PMC8075502  PMID: 33936520

Abstract

The normalization of clinical documents is essential for health information management, given the enormous amount of clinical documentation generated each year. The LOINC Document Ontology (DO) is a universal clinical document standard with a hierarchical structure. The objective of this study is to investigate the feasibility and generalizability of LOINC DO by mapping clinical note titles from five institutions to the five DO axes. We first developed an annotation framework based on the definitions of the LOINC DO axes and manually mapped 4,000 titles. We then applied a pre-trained deep learning model, Bidirectional Encoder Representations from Transformers (BERT), to enable automatic mapping from titles to LOINC DO axes. The results show that the BERT-based automatic mapping achieved improved performance compared with the baseline model. By analyzing both the manual annotations and the predicted results, we discuss ambiguities in the LOINC DO axis definitions.

Introduction

Electronic health records (EHRs) contain massive amounts of narrative data such as clinical notes, discharge summaries, lab reports and pathology reports, which often contain detailed patient status, treatment, and outcome information required for clinical and translational studies.1 The first step in utilizing notes for clinical research is often to search for and find the right types of notes needed for specific studies, which requires a standard and consistent naming convention for clinical documents. However, different EHR vendors, institutions, and healthcare providers may follow different rules to name clinical documents, which makes it challenging to find the right documents within an institution or across multiple institutions2. Therefore, it is essential to standardize the names of clinical notes in order to optimize the search, retrieval, and management of clinical documents for research purposes.

There have been efforts to develop ontologies3 that provide standard representations of clinical document types. For example, Logical Observation Identifiers Names and Codes (LOINC) has developed a Document Ontology (DO)4 to increase interoperability across diverse EHR systems. It was first proposed and developed by the Document Ontology Task Force5,6 and later extended by the LOINC Committee7. The LOINC DO follows the HL7 Clinical Document Architecture (CDA)8, a standard for defining clinical documents based on document structure and semantic types. The current LOINC DO is a subset of LOINC codes that describes key attributes of clinical documents along five axes: Type of Service (ToS), Kind of Document (KoD), Setting, Role, and Subject Matter Domain (SMD). Each axis maintains a set of controlled vocabularies in a poly-hierarchical structure. Type of Service defines the kind of healthcare service (e.g., Consultation, Evaluation and Management, Procedure) provided to patients. Kind of Document specifies the type of clinical document (e.g., Note, Report, Checklist) on the basis of its structure. Setting defines the location or channel (e.g., Ambulance, Birthing Center, Intensive Care Unit) where clinical care is provided. Role lists the people and occupations (e.g., Physician, Nurse, Pharmacist) involved in the service or the authors who created the clinical note. Subject Matter Domain specifies the clinical specialty (e.g., Anesthesiology, Urology, Cardiovascular Disease) that is relevant to the document or the main purpose of creating the document. The latest version of the LOINC DO has 151 values in the Type of Service axis, 83 in Kind of Document, 24 in Setting, 41 in Role, and 222 in Subject Matter Domain.

A number of studies have investigated standardizing clinical documents using LOINC codes and/or the Document Ontology. Hyun and Bakken9 retrieved section headers from nursing documents; 38% of the documents were successfully mapped to the LOINC semantic model without ambiguity. Hyun et al.10 extended the Subject Matter Domain (SMD) axis by combining the original DO list with a value list from the American Board of Medical Specialties (ABMS). Chen et al.11 investigated the level of coverage and the challenges of representing clinical documents with LOINC using document names from Fairview Health Services (FHS)12 and Fletcher Allen Health Care (FAHC). Li et al.13 evaluated the generalizability of the LOINC DO on clinical document types in inpatient settings from New York Presbyterian Hospital and investigated the possibility of document exchange using LOINC codes. Wang et al.14 assessed the adequacy of the LOINC DO in standardizing clinical documents extracted from a research clinical data repository and found that more vocabulary needed to be added to the Role and Setting axes for unambiguous mapping. Rajamani et al. explored the possibility of extending the Setting15 and Role16 axes of the LOINC DO, which contain fewer terms, by combining current DO values with concepts from external sources. Beitia et al.17 compared anatomic terms from five standards and matched the 100 most frequent terms to LOINC CT codes; the results showed that LOINC codes ranked significantly higher than other standards in terms of information exchange in the radiology domain. Parr et al.18 developed a machine learning-based pipeline that mapped noisy labels for laboratory tests from the Department of Veterans Affairs Corporate Data Warehouse to LOINC codes, achieving a correction rate of 83%. Peng et al. manually mapped frequently used CT terms from 40 hospitals to LOINC codes and increased the coverage rate to 93% after creating 215 new LOINC CT terms.

Despite the progress in extending the LOINC Document Ontology and manually analyzing mappings from local institutions to LOINC codes, there is no existing system that can automatically map document titles from any institution to the LOINC DO axes. Here we present an initial study to develop automated methods that map a document title to LOINC DO axes via an entity recognition approach. We define the task as identifying entities in document titles and classifying them into the five axes defined in the LOINC DO. For example, given the note title “CH Primary Care-CHPCC Consult 088”, we attempt to recognize that the entity “CH Primary Care” belongs to the SMD axis; “CHPCC”, which stands for Children’s Hospital Primary Care Center, is a Setting; and “Consult” is a Type of Service (ToS). Named entity recognition (NER) has been extensively studied for clinical documents, with diverse approaches including rule-based methods, machine learning-based methods, and more recent deep learning-based algorithms19,20. For example, the Bidirectional Encoder Representations from Transformers (BERT)21 model, a pre-trained language model that can be fine-tuned on various downstream tasks, has been widely deployed in NER tasks, including those in the medical domain, and has shown significantly better performance compared with conventional machine learning methods22.
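To make this framing concrete, the following is a minimal, hypothetical sketch of how the example title above could be represented as token/label pairs under a five-axis BIO scheme; the tokenization and label names are illustrative assumptions rather than output from our pipeline.

```python
# Hypothetical BIO-style representation of the example title discussed above.
title = "CH Primary Care-CHPCC Consult 088"

tokens = ["CH", "Primary", "Care", "CHPCC", "Consult", "088"]
labels = [
    "B-SMD", "I-SMD", "I-SMD",  # "CH Primary Care" -> Subject Matter Domain
    "B-Setting",                # "CHPCC" -> Setting
    "B-ToS",                    # "Consult" -> Type of Service
    "O",                        # "088" -> non-entity token
]

assert len(tokens) == len(labels)
print(list(zip(tokens, labels)))
```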

The objectives of this study are to: (1) propose an annotation framework that identifies entities and assigns them to LOINC DO axes; (2) implement different NER approaches, including recent deep learning-based NER algorithms, for automatic mapping; and (3) evaluate the feasibility and generalizability of the LOINC DO for clinical note title normalization. To achieve these goals, we collected 18,075 clinical document titles from five medical centers, developed different NER approaches to recognize DO entities, and carefully analyzed their performance across the five institutions. Our best-performing NER system for DO entities achieved an F-measure of 0.9179, indicating the feasibility of this solution. We believe this study sheds light on the further development of automated systems to normalize note types to a standard LOINC DO representation, thus facilitating more efficient search, retrieval, and usage of clinical documents in downstream applications.

Methods

Datasets

In this study we retrieved clinical document titles from the following five institutions: Boston Children’s Hospital (BCH), Vanderbilt University Medical Center (VUMC), Stanford University Medical School (SUMS), The University of Texas Health Science Center at Houston (UTHealth), and Columbia University Medical Center (CUMC). Different approaches were used to retrieve note titles from the EHRs at each institution. For example, the note titles of Boston Children’s Hospital are event code descriptions in PowerChart. The titles from Columbia were extracted from its clinical data warehouse for both inpatient and outpatient settings, including ancillary reports as well as clinical notes. UTHealth note titles were also extracted from the clinical data warehouse of its outpatient practice group (UT Physicians). Note titles from Stanford were extracted from the EPIC Clarity tool in Stanford’s Clinical Data Warehouse. Duplicate note names were removed from each institution’s dataset.

Annotation Framework

We proposed and implemented an annotation framework following the LOINC DO axis definitions. Entities rarely appeared as exact matches to LOINC DO values, so it is vital to recognize synonyms and acronyms as qualifiers of those values; a minimal lookup sketch is given after Table 1. Table 1 lists the LOINC DO values that appeared most frequently during our annotation process. All annotations were completed using the CLAMP (Clinical Language Annotation, Modeling and Processing) Toolkit23.

Table 1.

LOINC DO values and corresponding instances retrieved from note titles.

| LOINC DO Value | Description | Examples |
| Type of Service (ToS) | | |
| Communication | Exchanges of information between patients and doctors, or | Comm, Internal Corres |
| Consultation | Meeting between patients and doctors for medical advice | Consult, visit, follow up |
| Evaluation | Assessment of patients’ medical history and health conditions | Re-Eval, Initial Eval, Assessment |
| History and Physical | Information about health history and findings at time of admission | H&P, Hist + Phys, Exposure/Travel History |
| Medical Equipment and Product | Medical, surgical and home medical products, equipment and supplies | Throat Culture, Amb BP Monitor |
| Plan | Schemes or outlines for administering care or treatments | Treatment Plan, Coordination Plan |
| Procedure | Examinations for determining, measuring, or diagnosing a patient condition | ECRP, Food Patch Testing, non Tunn cath, Abscess Drain |
| Kind of Document (KoD) | | |
| Note | General narrative texts | Note, Progress note, Record |
| Letter | Documents issued by physicians, usually describing patients’ health conditions | Parent/School Letters |
| Report | Results of medical examinations | Final report, Rpt, test results |
| Consent | Patients’ authorization or agreement on medical treatment | Release of Information |
| Diagram | General graphs or figures | Growth Chart |
| Setting | | |
| Intensive Care Unit | Locations concerning diagnosis and management of critical care | NICU, MSICU |
| Outpatient | Locations where patients receive treatments without admission | Clinic, Office, OP |
| Role | | |
| Patient | Reference to the patient | New Patient, Estab Pat, Pt |
| Physician | Reference to doctors of all professional levels | MD, Consultant, Intern |
| Subject Matter Domain (SMD) | | |
| Anesthesiology | Specialty concerning anesthesia and anesthetics | Pain Treatment, ANEST |
| Dermatology | Diseases or treatments concerning skin, hair and nails | Atopic Dermatitis |
| Internal Medicine | Diagnoses or treatments concerning general adult diseases | Adolescent, Cardio, Endoc |
| Medical Genetics | Applications of genetic study to medical care | Sequencing |
| Neurology | Diseases or treatments concerning the nervous system | NEURO, Neuromuscular |
| Radiology | Diagnostic tests or treatments using medical imaging | NM-CT Head, MR Neuro, XR-Spine, TOMOGRAPHY |
| Surgery | Treatments of injury or disease with instruments | SURG, Cardiac Surgery |
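The mapping from such qualifiers to DO values can be pictured as a simple lookup. Below is a minimal, hypothetical sketch assuming a small hand-built dictionary whose entries mirror Table 1; it illustrates the idea and is neither a released resource nor the annotation tool itself.

```python
# Hypothetical lookup from title qualifiers (synonyms/acronyms such as those in
# Table 1) to (axis, LOINC DO value) pairs; entries shown here are illustrative.
QUALIFIER_TO_DO_VALUE = {
    "consult": ("ToS", "Consultation"),
    "h&p": ("ToS", "History and Physical"),
    "re-eval": ("ToS", "Evaluation"),
    "rpt": ("KoD", "Report"),
    "progress note": ("KoD", "Note"),
    "nicu": ("Setting", "Intensive Care Unit"),
    "estab pat": ("Role", "Patient"),
    "anest": ("SMD", "Anesthesiology"),
}

def lookup(qualifier: str):
    """Return the (axis, DO value) pair for a qualifier, or None if unseen."""
    return QUALIFIER_TO_DO_VALUE.get(qualifier.strip().lower())

print(lookup("Consult"))  # ('ToS', 'Consultation')
print(lookup("NICU"))     # ('Setting', 'Intensive Care Unit')
```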

One major difficulty in this annotation task is handling radiology reports, which account for a large portion of the datasets from all five institutions. Typically, such report names are composed of four elements: imaging methods (e.g., X-ray, CT, MRI), anatomic locations (e.g., head, neck, spine), views (e.g., 1 view, Merchants, Lateral) and contrast (e.g., w/ Contrast, w/o Contr). These four elements appear in no fixed order, and some are occasionally omitted from the titles. In this study we mainly aimed to identify imaging methods and anatomic locations. Figure 1 includes examples illustrating two common scenarios in radiology report name annotation: (1) if these two elements appear consecutively, we annotate them together as one SMD entity (e.g., “NM-CT” and “Head” were annotated as one entity in “NM-CT Head w/ Contrast”); (2) if the imaging method is unspecified in the title, we do not identify anatomic locations, views or contrast as entities (e.g., “Feet - Both, 3 Views or More-Lex” was left unannotated). A rough sketch of these rules is given after Figure 1.

Figure 1. Sample annotations of note types.
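Purely for illustration, a minimal sketch of the two radiology rules above might look like the following; the cue lists are hypothetical, and the actual annotation was performed manually in CLAMP.

```python
# A rough, illustrative sketch of the two radiology-title rules described above.
IMAGING_METHODS = {"x-ray", "xr", "ct", "nm-ct", "mri", "mr", "us"}  # assumed cues
ANATOMIC_SITES = {"head", "neck", "spine", "feet", "chest"}          # assumed cues

def radiology_smd_entity(title: str):
    """Return a combined SMD entity when an imaging method is immediately
    followed by an anatomic location; otherwise return None."""
    tokens = title.lower().split()
    for method, site in zip(tokens, tokens[1:]):
        if method in IMAGING_METHODS and site in ANATOMIC_SITES:
            # Rule (1): method + location are annotated together as one SMD entity.
            return f"{method} {site}"
    # Rule (2): no imaging method present -> leave the title unannotated.
    return None

print(radiology_smd_entity("NM-CT Head w/ Contrast"))            # "nm-ct head"
print(radiology_smd_entity("Feet - Both, 3 Views or More-Lex"))  # None
```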

Another important decision annotators had to make is whether to combine two consecutive entities into a single entity. Two conditions must be fulfilled before two entities are combined: (1) the two entities are identified and classified into the same axis; and (2) one entity can be considered a modifier of the other. Note that exceptions may occur when an exact match is found in the LOINC DO hierarchy. For example, given the title “Procedure consent”, “Procedure” belongs to ToS and “consent” is a KoD. However, LOINC DO lists the exact term “Procedure note” as a single value under the KoD axis. Under such circumstances we conform to the rules set by LOINC DO and annotate such titles as one single entity.
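As a rough illustration of this decision (not the annotation tool itself), the merging logic might be sketched as follows; the EXACT_DO_TERMS set is a stand-in for a lookup against the LOINC DO value list, and condition (2), whether one entity modifies the other, is a human judgment not modeled here.

```python
# Illustrative sketch of merging two consecutive entities, given as (text, axis) pairs.
EXACT_DO_TERMS = {"procedure note"}  # stand-in for exact terms listed in LOINC DO

def merge_entities(ent1, ent2):
    """Return a single merged entity when the rules allow it, else None."""
    combined = f"{ent1[0]} {ent2[0]}".lower()
    # Exception: an exact LOINC DO term overrides the axis-based conditions.
    if combined in EXACT_DO_TERMS:
        return (combined, "KoD")
    # Condition (1): same axis. Condition (2), modifier status, needs human judgment.
    if ent1[1] == ent2[1]:
        return (combined, ent1[1])
    return None

print(merge_entities(("Procedure", "ToS"), ("note", "KoD")))   # ('procedure note', 'KoD')
print(merge_entities(("Cardiac", "SMD"), ("Surgery", "SMD")))  # ('cardiac surgery', 'SMD')
```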

Model Architecture

In this study, we first developed a CRF-based NER model as a baseline using the CLAMP tool23. We then focused on fine-tuning the BERT model proposed by Devlin et al.21 for this task. Tokens in an input title were labeled with BIO tags, where “B” indicates the beginning of an entity, “I” indicates subsequent tokens within the entity, and “O” indicates non-entity tokens. A [CLS] token was added to the beginning of each title for the classification task. Figure 2 shows an example of predicting the title “Estab Pat Visit Level 4”. The title first underwent a preprocessing step including boundary detection and tokenization. Word embeddings were then generated and used as inputs to the BERT model. The output of the fine-tuning process was the final hidden vector of the [CLS] token, which represented the semantics of the whole title. If an entity was classified into a certain axis, the classification label of that axis was 1; otherwise the label was 0. The probability of a classification label being 1 was calculated using a softmax function. A minimal fine-tuning sketch is given after Figure 2.

Figure 2. The architecture of the fine-tuned BERT model used in this study. “Tok” stands for the token inputs to the BERT model. In this example, “Estab” was labelled “B-Role”, “Pat” was labelled “I-Role”, “Visit” was recognized as a ToS entity, and “Level” and “4” are non-entity tokens.
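The following is a minimal sketch, not the authors’ exact implementation, of fine-tuning BERT for token-level BIO tagging with the Hugging Face transformers library. The checkpoint name, label set, and the single toy example (mirroring Figure 2) are illustrative assumptions, and the sketch uses the standard per-token classification head with a maximum sequence length of 128.

```python
# Minimal illustrative sketch: BERT token classification for BIO-tagged note titles.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

labels = ["O",
          "B-ToS", "I-ToS", "B-KoD", "I-KoD", "B-Setting", "I-Setting",
          "B-Role", "I-Role", "B-SMD", "I-SMD"]
label2id = {l: i for i, l in enumerate(labels)}

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained("bert-base-uncased",
                                                   num_labels=len(labels))

# One toy training example, mirroring Figure 2.
words = ["Estab", "Pat", "Visit", "Level", "4"]
word_labels = ["B-Role", "I-Role", "B-ToS", "O", "O"]

enc = tokenizer(words, is_split_into_words=True,
                truncation=True, max_length=128, return_tensors="pt")

# Align word-level labels to wordpieces; special tokens get -100 and are ignored.
aligned = [(-100 if wid is None else label2id[word_labels[wid]])
           for wid in enc.word_ids(batch_index=0)]
labels_tensor = torch.tensor([aligned])

outputs = model(**enc, labels=labels_tensor)
outputs.loss.backward()  # an optimizer step over many such batches would follow
print(float(outputs.loss))
```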

Experiments and Evaluation Metrics

We randomly selected 800 note titles from each of the five institutions (4,000 note titles in total), and three reviewers split the annotation task into 1,600 titles (Columbia and UT Health), 1,600 titles (Stanford and Vanderbilt), and 800 titles (BCH), respectively. While mapping the titles to LOINC DO axes, the reviewers were asked to compare the entities with the mapped LOINC DO values and rate the mappings in terms of coverage. Ambiguous titles, especially those containing entities with only fuzzy matches, were collected and discussed by all reviewers to ensure high annotation consistency. For the automatic mapping, the 4,000 annotated titles were randomly split into training, development and test sets with a ratio of 8:1:1. A series of experiments was also conducted within each institution to explore the variance of model performance between institutions. The evaluation results presented in the Results section were calculated based on 10-fold cross validation. The evaluation metrics used in the experiments are listed below:

$\text{precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}$ (1)
$\text{recall} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}}$ (2)
$F_1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$ (3)
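A minimal sketch of computing these entity-level metrics is given below, assuming that gold and predicted entities are represented as (start, end, axis) spans for each title; the toy spans are hypothetical.

```python
# Illustrative entity-level precision/recall/F1, following Equations (1)-(3).
def prf1(gold_entities, predicted_entities):
    gold, pred = set(gold_entities), set(predicted_entities)
    tp = len(gold & pred)   # exact span-and-axis matches
    fp = len(pred - gold)   # predicted entities not in the gold standard
    fn = len(gold - pred)   # gold entities missed by the model
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [(0, 9, "Role"), (10, 15, "ToS")]
pred = [(0, 9, "Role"), (16, 23, "SMD")]
print(prf1(gold, pred))  # (0.5, 0.5, 0.5)
```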

The maximum sequence length of the BERT model was set to 128, and we kept the default settings for the other hyperparameters during the experiments.

Results

Statistics of the dataset

Table 2 shows the total number of note titles retrieved from each institution. Titles from BCH, Columbia and Stanford are mainly combinations of key words that summarize the purpose of the document and the clinical specialties related to it. Titles from UT Health are divided into three sections: number, document type and specialty. Titles from Vanderbilt are all recorded in capital letters.

Table 2.

The number of titles collected from each institution along with examples.

| Institution Name | Number of Titles | Examples |
| BCH | 7400 | Atopic Dermatitis New Patient Visit |
| Columbia | 881 | Respite Care - Outpt Record |
| UT Health | 3232 | No.: 358259; Doc Type: Chart Note; Specialty: Orthopedic Surgery |
| Stanford | 3128 | Outside Upper Ext US Interp |
| Vanderbilt | 3434 | HEPATOLOGY PHYSICIAN CONSULT |

Figure 3 shows the most frequently appearing tokens as a word cloud. We found that the most frequently used tokens come mainly from the KoD and ToS axes. Although our titles include documents from both inpatient and outpatient settings, the word “clinic”, which indicates that a document was generated in an outpatient setting, appeared most often. Clinical specialties such as family medicine, pediatrics and anesthesia were also mentioned frequently. Common healthcare services, including admission, evaluation and communication, appear in most note titles.

Figure 3. Word cloud of the terms and phrases that appeared most frequently in note titles.
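For reference, a word cloud like Figure 3 can be generated from the title strings with a few lines of code; this sketch assumes the open-source wordcloud and matplotlib packages and a hypothetical `titles` list.

```python
# Illustrative generation of a note-title word cloud (cf. Figure 3).
from wordcloud import WordCloud
import matplotlib.pyplot as plt

titles = ["Atopic Dermatitis New Patient Visit",  # toy examples; the real input
          "Respite Care - Outpt Record",           # would be the full title list
          "HEPATOLOGY PHYSICIAN CONSULT"]

wc = WordCloud(width=800, height=400, background_color="white")
wc.generate(" ".join(titles))

plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.savefig("title_wordcloud.png")
```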

Manual Annotation

Table 3 shows the distribution of entities in the annotated titles, categorized by LOINC DO axis. Among the five axes, clinical note titles are more likely to include information from the ToS and SMD axes, while information relevant to Setting and Role is rarely included. The number of entities recognized differs by institution: Stanford had the fewest entities identified per title, while Vanderbilt had approximately three entities per title.

Table 3.

Entity distribution by axis for each institution. The total number of entities recognized and the number of entities per title are also shown.

| Institution | ToS | KoD | Setting | Role | SMD | Total No. of Entities | No. of Entities per Title |
| BCH | 26% | 18% | 13% | 4% | 39% | 622 | 1.5550 |
| Columbia | 25% | 34% | 15% | 4% | 22% | 850 | 2.1250 |
| UT Health | 13% | 24% | 1% | 9% | 53% | 874 | 2.1850 |
| Stanford | 29% | 15% | 6% | 4% | 46% | 560 | 1.4000 |
| Vanderbilt | 28% | 29% | 8% | 8% | 27% | 1181 | 2.9525 |

Table 4 shows the coverage of each LOINC DO axis by institution. Most of the entities identified across all institutions could be mapped to a value in the LOINC DO hierarchy. The DO values of the Role axis showed adequate coverage across the five institutions. KoD also showed adequate coverage of the annotated entities, as a large portion of documents were mapped to DO values such as “Note”, “Progress note” and “Report”. The ToS and SMD axes were involved in most of the fuzzy-matched entities, especially for titles from BCH and Stanford. These entities typically come from radiology reports and are more specific than the mapped LOINC DO values. For example, “NM-CT Head” can only be mapped to “Radiology” in the SMD axis, as there are no DO values specifying imaging methods in the current list. A rough sketch of how exact and fuzzy matches might be rated is given after Table 4.

Table 4.

LOINC DO Coverage by institutions, rated by three annotators for 4,000 titles.

| Institution | Matching Criteria | ToS | KoD | Setting | Role | SMD |
| BCH | Exact Match | 47% | 87% | 90% | 93% | 42% |
| | Fuzzy Match | 51% | 13% | 10% | 7% | 55% |
| | Not Covered | 2% | - | - | - | 3% |
| Columbia | Exact Match | 67% | 81% | 86% | 95% | 41% |
| | Fuzzy Match | 30% | 19% | 14% | 4% | 55% |
| | Not Covered | 3% | - | - | 1% | 4% |
| UT Health | Exact Match | 91% | 95% | 94% | 92% | 87% |
| | Fuzzy Match | 9% | 5% | 6% | 7% | 13% |
| | Not Covered | 1% | - | - | 1% | - |
| Stanford | Exact Match | 53% | 83% | 72% | 92% | 48% |
| | Fuzzy Match | 44% | 17% | 28% | 8% | 48% |
| | Not Covered | 3% | 2% | - | - | 4% |
| Vanderbilt | Exact Match | 89% | 86% | 90% | 95% | 87% |
| | Fuzzy Match | 10% | 14% | 9% | 4% | 12% |
| | Not Covered | 1% | - | 1% | 1% | 1% |
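The coverage ratings above were assigned by the annotators. Purely as an illustration of how an entity could be rated against the DO value list, the sketch below uses string similarity as a stand-in; the value lists and the 0.6 cut-off are assumptions, not the annotators' actual criteria, which relied on human judgment (e.g., mapping "NM-CT Head" to "Radiology" as a fuzzy match).

```python
# Illustrative exact/fuzzy/not-covered rating against a toy LOINC DO value list.
from difflib import SequenceMatcher

DO_VALUES = {"SMD": ["Radiology", "Surgery", "Anesthesiology"],
             "ToS": ["Consultation", "Evaluation", "Procedure"]}

def rate_coverage(entity: str, axis: str, threshold: float = 0.6) -> str:
    best = 0.0
    for value in DO_VALUES.get(axis, []):
        if entity.lower() == value.lower():
            return "Exact Match"
        best = max(best, SequenceMatcher(None, entity.lower(), value.lower()).ratio())
    return "Fuzzy Match" if best >= threshold else "Not Covered"

print(rate_coverage("Surgery", "SMD"))  # Exact Match
print(rate_coverage("Consult", "ToS"))  # Fuzzy Match (close to "Consultation")
```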

Automatic Mapping Performance

Table 5 shows the overall performance of the CRF and BERT models trained and evaluated with 10-fold cross validation on the 4,000 annotated note titles across the five institutions. Both the BERT and the baseline CRF model achieved high performance on the KoD, Setting and Role axes. For ToS and SMD entities, which are more difficult to recognize due to their higher variability, BERT showed improved performance over the CRF model.

Table 5.

The overall performance of the BERT and baseline models trained from 4,000 titles across five institutions.

| LOINC DO Axis | Precision (BERT) | Precision (CRF) | Recall (BERT) | Recall (CRF) | F1 (BERT) | F1 (CRF) |
| ToS | 0.7187 | 0.7880 | 0.7848 | 0.7270 | 0.7494 | 0.7120 |
| KoD | 0.9076 | 0.9110 | 0.9286 | 0.8930 | 0.9179 | 0.9020 |
| Setting | 0.8911 | 0.9190 | 0.9226 | 0.8940 | 0.9058 | 0.9060 |
| Role | 0.8810 | 0.9210 | 0.8837 | 0.8610 | 0.8811 | 0.8900 |
| SMD | 0.8153 | 0.8139 | 0.8434 | 0.7880 | 0.8290 | 0.8000 |

Table 6 shows the F1 scores of the BERT predictions by institution, compared with the baseline model. The results show that performance varies heavily by institution, especially for the ToS axis, where F1 scores range from approximately 0.4 to 0.9. The fluctuation in ToS performance is largely due to the high variability of entities in this axis. Many entities, such as “Chemical Pleurodesis” and “Cholangiopancreatography”, appeared only once in the annotated BCH dataset; the model therefore failed to identify such terms as entities. By contrast, ToS performance for UT Health and Vanderbilt is higher, as their titles contained fewer specialized procedure terms. The BERT model also demonstrated consistently improved performance in recognizing SMD entities for all five institutions, in line with the overall results.

Table 6.

The F1 scores of the BERT and baseline models by institution.

| Institution | ToS (BERT) | ToS (CRF) | KoD (BERT) | KoD (CRF) | Setting (BERT) | Setting (CRF) | Role (BERT) | Role (CRF) | SMD (BERT) | SMD (CRF) |
| BCH | 0.4567 | 0.5030 | 0.8418 | 0.7010 | 0.8862 | 0.8480 | 0.7592 | 0.5780 | 0.7290 | 0.6470 |
| Columbia | 0.6533 | 0.6250 | 0.8860 | 0.8600 | 0.8823 | 0.9160 | 0.6957 | 0.6670 | 0.7234 | 0.6630 |
| UT Health | 0.8185 | 0.8130 | 0.9317 | 0.9340 | 0.9284 | 0.8000 | 0.9431 | 0.9420 | 0.9397 | 0.9240 |
| Stanford | 0.5657 | 0.6440 | 0.8983 | 0.8520 | 0.8326 | 0.7940 | 0.8256 | 0.8500 | 0.7284 | 0.6400 |
| Vanderbilt | 0.9165 | 0.9190 | 0.9679 | 0.9440 | 0.9544 | 0.9730 | 0.9450 | 0.9590 | 0.9487 | 0.9260 |

Discussion

In this study, we proposed and evaluated a novel NLP pipeline for mapping entities in clinical document titles to the five LOINC DO axes. To our knowledge, this is the first attempt to automatically standardize clinical document types across institutions without accessing the document contents. The statistical analysis suggests that the LOINC Document Ontology has relatively high coverage of document titles and could serve as a universal standard for clinical note type normalization.

Our results reveal that overall BERT achieved relatively high performance in recognizing KoD, Setting and Role entities, which have less variety and appear repeatedly in consistent forms, such as “Note”, “Clinic” and “Patient”. For axes such as SMD, which contain a large number of highly variable entities, BERT demonstrated notably improved performance, while the improvements were less visible for Setting and Role entities, which account for only a small portion of all recognized entities. Furthermore, the performance of the automatic mapping differs by institution, as institutions often deploy diverse naming conventions for notes. The model performed significantly better on titles from Vanderbilt and UT Health, which may be because titles from these two institutions were named according to more rigorous rules that arrange entities from different axes in a particular order. Another reason for the observed variation among institutions could be the different methods used to collect titles at each site. We plan to standardize the data collection process in future work.

Although LOINC DO maintains a large number of terms that can represent narrative document titles precisely, there is a certain level of overlap between axes, particularly between the SMD and ToS axes. For example, many radiologic examinations, such as CT, MRI and X-ray, can be interpreted from different perspectives: (1) as medical imaging methods, or (2) as tests to diagnose or examine patients’ health conditions. Such titles can be classified as both ToS and SMD. Another example is the classification of surgeries and procedures: in LOINC DO, “Surgery” belongs to the SMD axis whereas “Procedure” is considered a Type of Service. This increases the difficulty of classifying entities related to surgical procedures, such as “Breast biopsy” and “Hysteroscopy”, as they fit into both axes. Such ambiguities resulted in inconsistencies between annotators and may contribute to the less satisfying performance of the BERT-based automatic mapping on the SMD and ToS axes. To improve LOINC DO as a document standard, more specific terms should be incorporated under “Radiology” in the SMD axis and “Procedure” in ToS. In addition, the DO list contains several redundant values that are rarely used in the mapping process. For example, there are 58 values related to “Compensation and Pension” in the ToS axis; this type of information rarely appeared in the datasets used in this study, and such values could potentially be removed from the current list.

This study is only a first step toward automatic mapping to LOINC DO as a method for clinical note title standardization. Ultimately, we aim to develop a multi-label system that can automatically assign exact LOINC DO values to clinical note titles, which would extend the study to a text classification task. When document content is available, we can further develop methods to infer document types from full-text notes. Another possible direction is to investigate different variants of the BERT model. In this study we used the original BERT large model pre-trained on open-domain text; prediction accuracy may be further improved by using models pre-trained on biomedical text, such as BioBERT24 or ClinicalBERT25.

Conclusion

In this study, we developed and evaluated an NLP pipeline to identify entities in clinical note titles and classify them into LOINC DO axes for document standardization. Our results show that the BERT-based automatic mapping achieved good performance over all LOINC DO axes compared with the baseline model, and that it improved performance for entities in the ToS and SMD axes, which carry more ambiguity.

Acknowledgement

We would like to thank the OHDSI NLP Working Group community for the active discussion of this task. This project is partially supported by grants NCI U24 CA194215, NIGMS 5U01TR002062, CPRIT RP170668 and CPRIT RP160015.

Conflicts of Interest

Dr. Xu and The University of Texas Health Science Center at Houston have research-related financial interests in Melax Technologies, Inc.

References

1. Jensen PB, Jensen LJ, Brunak S. Mining electronic health records: towards better research applications and clinical care. Nat Rev Genet. 2012;13(6):395–405. doi: 10.1038/nrg3208.
2. Shapiro JS, Bakken S, Hyun S, Melton GB, Schlegel C, Johnson SB. Document ontology: supporting narrative documents in electronic health records. AMIA Annu Symp Proc. 2005;2005:684–688.
3. Amith M, He Z, Bian J, Lossio-Ventura JA, Tao C. Assessing the practice of biomedical ontology evaluation: gaps and opportunities. J Biomed Inform. 2018;80:1–13. doi: 10.1016/j.jbi.2018.02.010.
4. LOINC Document Ontology. Available at: http://loinc.org/discussion-documents/document-ontology. Last accessed March 12, 2020.
5. Huff SM (chair), Document Ontology Task Force. Proposal for an Ontology for Exchange of Clinical Documents [monograph on the Internet]. Ann Arbor: Health Level Seven, Inc; 2000. Available from: http://www.hl7.org/Special/dotf/docs/DocumentOntologyProposalJuly00.doc.
6. Document Ontology Task Force [homepage on the Internet]. Ann Arbor: Health Level Seven, Inc; 1997-2004. Available at: http://www.hl7.org/special/dotf/dotf.htm.
7. Huff SM, Rocha RA, McDonald CJ, et al. Development of the Logical Observations Identifiers, Names, and Codes (LOINC) vocabulary. J Am Med Inform Assoc. 1998;5:276–92. doi: 10.1136/jamia.1998.0050276.
8. Dolin RH, Alschuler L, Boyer S, Beebe C, Behlen FM, Biron PV, et al. HL7 Clinical Document Architecture, Release 2. J Am Med Inform Assoc. 2006;13(1):30–9. doi: 10.1197/jamia.M1888.
9. Hyun S, Bakken S. Toward the creation of an ontology for nursing document sections: mapping section names to the LOINC semantic model. AMIA Annu Symp Proc. 2006;2006:364–368.
10. Hyun S, Shapiro JS, Melton G, et al. Iterative evaluation of the Health Level 7--Logical Observation Identifiers Names and Codes Clinical Document Ontology for representing clinical document names: a case report. J Am Med Inform Assoc. 2009;16(3):395–399. doi: 10.1197/jamia.M2821.
11. Chen ES, Melton GB, Engelstad ME, Sarkar IN. Standardizing clinical document names using the HL7/LOINC Document Ontology and LOINC codes. AMIA Annu Symp Proc. 2010;2010:101–105.
12. Fairview Health Services. Available at: http://www.fairview.org/. Last accessed March 12, 2020.
13. Li L, Morrey CP, Baorto D. Cross-mapping clinical notes between hospitals: an application of the LOINC Document Ontology. AMIA Annu Symp Proc. 2011;2011:777–783.
14. Wang Y, Pakhomov S, Dale JL, Chen ES, Melton GB. Application of HL7/LOINC Document Ontology to a university-affiliated integrated health system research clinical data repository. AMIA Jt Summits Transl Sci Proc. 2014;2014:230–234.
15. Rajamani S, Chen ES, Wang Y, Melton GB. Extending the HL7/LOINC Document Ontology settings of care. AMIA Annu Symp Proc. 2014;2014:994–1001.
16. Rajamani S, Chen ES, Akre ME, Wang Y, Melton GB. Assessing the adequacy of the HL7/LOINC Document Ontology Role axis. J Am Med Inform Assoc. 2015;22(3):615–620. doi: 10.1136/amiajnl-2014-003100.
17. Beitia AO, Lowry T, Vreeman DJ, et al. Standard anatomic terminologies: comparison for use in a health information exchange-based prior computed tomography (CT) alerting system. JMIR Med Inform. 2017;5(4):e49. doi: 10.2196/medinform.8765.
18. Parr SK, Shotwell MS, Jeffery AD, Lasko TA, Matheny ME. Automated mapping of laboratory tests to LOINC codes using noisy labels in a national electronic health record system database. J Am Med Inform Assoc. 2018;25(10):1292–1300. doi: 10.1093/jamia/ocy110.
19. Wang Y, Wang L, Rastegar-Mojarad M, et al. Clinical information extraction applications: a literature review. J Biomed Inform. 2018;77:34–49. doi: 10.1016/j.jbi.2017.11.011.
20. Wu S, Roberts K, Datta S, Du J, Ji Z, Si Y, Soni S, Wang Q, Wei Q, Xiang Y, Zhao B, Xu H. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc. 2020;27(3):457–470. doi: 10.1093/jamia/ocz200.
21. Devlin J, Chang MW, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. 2018.
22. Si Y, Wang J, Xu H, Roberts K. Enhancing clinical concept extraction with contextual embeddings. J Am Med Inform Assoc. 2019;26(11):1297–1304. doi: 10.1093/jamia/ocz096.
23. Soysal E, Wang J, Jiang M, Wu Y, Pakhomov S, Liu H, Xu H. CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines. J Am Med Inform Assoc. 2017;25(3):331–6. doi: 10.1093/jamia/ocx132.
24. Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, Kang J. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. arXiv preprint arXiv:1901.08746. 2019.
25. Huang K, Altosaar J, Ranganath R. ClinicalBERT: modeling clinical notes and predicting hospital readmission. arXiv preprint arXiv:1904.05342. 2019.
