2022 Feb 21;2021:611–620.

©2021 AMIA - All rights reserved.

This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose

Figure 1: Tokenization encoding process. Training data is processed and tokenized to form the token-LOINC map. This map is then used to encode both the training and test data, producing vectorized rows. Each encoded row has one index per LOINC that exists in the training data catalog (here only 2 LOINCs), and the value at each index is the number of source-term tokens that overlap with the training data tokens mapped to that LOINC. For example, the first row is encoded as [3, 0] because 3 tokens in the source term match a token for LOINC 26464-8 and 0 tokens match a token for LOINC 26499-4.
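
The encoding described in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the function names (`tokenize`, `build_token_loinc_map`, `encode`), and the training terms are all hypothetical. Only the two LOINC codes and the [3, 0] result come from the figure. The sketch also assumes that a repeated token in a source term is counted each time it appears, which the caption does not specify.

```python
from collections import defaultdict


def tokenize(term):
    # Assumption: lowercase and split on whitespace. The paper's actual
    # preprocessing is not described in the caption.
    return term.lower().split()


def build_token_loinc_map(training_rows):
    # Collect, for each LOINC code, the set of tokens seen in its training terms.
    token_map = defaultdict(set)
    for term, loinc in training_rows:
        token_map[loinc].update(tokenize(term))
    return dict(token_map)


def encode(term, token_map, loinc_order):
    # One index per LOINC in the training catalog; each value counts the
    # source-term tokens that also appear among that LOINC's training tokens.
    tokens = tokenize(term)
    return [sum(1 for t in tokens if t in token_map[code]) for code in loinc_order]


# Hypothetical training rows (source term, LOINC); not taken from the paper.
training_rows = [
    ("WBC count blood", "26464-8"),
    ("leukocytes", "26464-8"),
    ("neutrophils absolute", "26499-4"),
]

token_map = build_token_loinc_map(training_rows)
loinc_order = sorted(token_map)  # fixed column order: ["26464-8", "26499-4"]

first_row = encode("Blood WBC count", token_map, loinc_order)
second_row = encode("absolute neutrophils auto", token_map, loinc_order)
print(first_row)   # [3, 0]
print(second_row)  # [0, 2]
```

Fixing `loinc_order` once, from the training catalog, keeps the vector positions consistent between training and test rows. A test-set token that never appeared in training contributes to no index.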