Table 1. Glossary of terms.
| Term | Description |
|---|---|
| Annotation code | Code written in Python 3 (version 3.7), an open-source scripting language, to match extracted end points to terms in the controlled vocabulary crosswalk. |
| Combination word | A word used in developmental toxicology treatment end points whose definition contains both a localization within the body (e.g., head) and an observation description (e.g., small), as in microcephaly. One of four types of words found in the user-defined look-up lists (defined below). |
| Controlled vocabulary | An authoritative set of terms selected and defined based on the requirements set out by the user group. Used to ensure consistent indexing (human or automated) or description of data or information. Controlled vocabularies do not necessarily have any structure or relationships between terms within the list.29,30 |
| Crosswalk | A spreadsheet file that includes all three controlled vocabularies and shows how terms in each vocabulary were matched to each other according to the rules described in Table 4. |
| Crosswalk compatible | A term that the annotation code did not map automatically and to which exactly one UMLS term was applied manually, where that UMLS term is associated with exactly one DevTox term and exactly one OECD term in the crosswalk; these two terms are pulled in by default. If a term is not crosswalk compatible, the appropriate DevTox and OECD terms are applied manually. |
| Concept unique identifiers (CUIs) | Unique identifiers assigned to concepts in UMLS (e.g., “C0015392” is the CUI for “eye”). A concept represents a single meaning and contains all words/phrases that express that meaning. |
| Data frame | A two-dimensional data structure with rows and columns (often used in Python’s most common data analysis and manipulation tools). |
| Developmental toxicology | The study of the potential for substances to cause birth defects and other signs of toxicity during embryo-fetal development.31 |
| Harmonized vocabulary | A combination of multiple languages into a single comparable view, built from the components of each.32 |
| Localization word | A word used in developmental toxicology treatment end points that describes a place within the body (e.g., head). |
| Natural language processing | A branch of computer science and linguistics that focuses on enabling computers to read, understand, process, and analyze human language. |
| Observation word | A word used in developmental toxicology treatment end points that describes a state of a body part (e.g., the word “small” in “small head”). |
| Ontology | A formal representation of a body of knowledge within a given domain, in a computer-readable format. Ontologies usually comprise a set of terms or concepts with relations that operate between them.30 |
| Predictive toxicology | Multidisciplinary approach to toxicology that uses innovative approaches to predict human-relevant health effects from exposure to substances. |
| Root words | Words within the user-defined look-up lists for which variations, such as adjectives and plurals, are also expected to be relevant (e.g., duplicat*). |
| Standardized language | Language that uses a common set of terms across datasets or resources. |
| Targeted review | Purposeful reviewing of specific end points known to be more susceptible to errors (e.g., end points related to “small” or “large”). |
| Unique word | A word used in developmental toxicology treatment end points whose meaning is not made up of a localization and an observation but is instead a stand-alone concept (e.g., “mortality”). |
| User-defined look-up lists | A collection of common words used in developmental toxicology treatment end points, linked with associated words [e.g., “retina” and “eye” (linked together) and “non-live” and “dead” (linked together)]. There are four lists: Localization list, Observation list, Combination list, and Unique words list (defined separately). |
| Whole words | Words within the user-defined look-up lists that are expected to stand alone and to be relevant only in their given word form (e.g., large), as opposed to root words (defined above). |
Note: OECD, Organization for Economic Cooperation and Development; UMLS, Unified Medical Language System.
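
The distinction between root words and whole words in Table 1 can be illustrated with a minimal Python sketch. It is not the annotation code itself; it assumes that a trailing asterisk marks a root word (as in “duplicat*”) and that matching is done with regular expressions. The names `LOOKUP_ENTRIES`, `entry_to_pattern`, and `matching_entries`, and the example end points, are hypothetical.

```python
import re
from typing import List

# Hypothetical look-up entries: a trailing asterisk marks a root word
# (assumption based on the "duplicat*" example); others are whole words.
LOOKUP_ENTRIES = ["duplicat*", "large"]


def entry_to_pattern(entry):
    """Compile a look-up entry into a case-insensitive regular expression."""
    if entry.endswith("*"):
        # Root word: the stem followed by any further word characters,
        # so "duplicat*" covers "duplicated", "duplication", and so on.
        body = re.escape(entry[:-1]) + r"\w*"
    else:
        # Whole word: only this exact word form is accepted.
        body = re.escape(entry)
    # Word boundaries prevent matches inside longer words.
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


def matching_entries(end_point: str, entries: List[str] = LOOKUP_ENTRIES) -> List[str]:
    """Return the look-up entries found in an extracted end point string."""
    return [e for e in entries if entry_to_pattern(e).search(end_point)]


print(matching_entries("Duplicated ribs"))
print(matching_entries("Large fontanelle"))
print(matching_entries("Enlarged liver"))
```

Under these assumptions, the root word “duplicat*” is found in “Duplicated ribs”, whereas the whole word “large” is found in “Large fontanelle” but not in “Enlarged liver”, because the word boundaries require the exact word form.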