Table 1.
Data type | Source vocabulary | Target terminology | Method |
---|---|---|---|
Demographics | EHR and ADT | HL7 v3 vocabulary for sex, race, ethnicity, vital status; ISO 639 for language | Manual mapping |
Encounters | EHR and ADT, or derived by TriNetX | HL7 v3 vocabulary for visit type (eg, inpatient, outpatient, ER) | Manual mapping |
Diagnoses | | ICD-10-CM | |
Procedures | | | |
Medications and Vaccinations | | RxNorm, OMOP extension of RxNorm, CVX Group codes | Semi-automated methods that use external sources such as the RxNorm "ApproximateTerm" API. For national catalogues of medications, TriNetX maps medications to RxNorm Ingredient + Route + Brand + Strength. |
Lab results, clinical findings, and vital signs | Local lab coding or LOINC | LOINC | |
Genomics | Structured data from molecular diagnostic labs (XML, JSON, CSV files), annotated VCF files, and cancer registry data from NAACCR records | HGNC (gene symbols), HGVS (SNVs), ISCN (SVs, cytogenomic), Genomic Coordinates, rsID, LOINC (eg, IHC, MSI) | |
Oncology | | ICD-O | NAACCR-based data sources (United States) are almost always linked to ICD-O, but other regions (eg, EMEA or Australia) frequently do not provide ICD-O data. When oncology data are provided as ICD-10-CM codes, additional mappings from ICD-10-CM to ICD-O topographies are applied. When morphologies are not provided, some ICD-10-CM codes carry morphology information, which enables the derivation of ICD-O morphologies. |
Cross-domain mappings | Selected HCPCS, SNOMED, and ICD-10-PCS codes | RxNorm | Data types are not homogeneous across regions, and some medications are frequently reported within procedure data sources (eg, CPT or OPS), so cross-domain mappings are also required to maximize the data coverage of TriNetX at a global scale. |
ADT: Admission Discharge Transfer; AEMPS: Agencia Española de Medicamentos y Productos Sanitarios; ATC: Anatomical Therapeutic Chemical; CPT: Current Procedural Terminology; CSV: comma-separated values; DM+D: Dictionary of Medicines and Devices; EAN: European Article Numbering; HCPCS: Healthcare Common Procedure Coding System; HGNC: HUGO Gene Nomenclature Committee; HGVS: Human Genome Variation Society; HL7: Health Level Seven; ICD-9: International Classification of Diseases, Ninth Revision; ICD-10-CM: International Classification of Diseases, Tenth Revision, Clinical Modification; ICD-10-GM: International Classification of Diseases, Tenth Revision, German Modification; ICD-10-PCS: International Classification of Diseases, Tenth Revision, Procedure Coding System; ICD-O-3: International Classification of Diseases for Oncology, third edition; IHC: immunohistochemistry; ISCN: International System for Human Cytogenomic Nomenclature; JSON: JavaScript Object Notation; LOINC: Logical Observation Identifiers Names and Codes; MSI: microsatellite instability; NAACCR: North American Association of Central Cancer Registries; NLP: natural language processing; NDC: National Drug Code; OPCS-4: OPCS Classification of Surgical Operations and Procedures (4th revision); OPS: Operationen- und Prozedurenschlüssel; rsID: Reference SNP cluster ID; SNOMED: Systematized Nomenclature of Medicine; SNV: single-nucleotide variant; SV: structural variant; VCF: Variant Call Format; XML: Extensible Markup Language.
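The semi-automated method listed for medications in Table 1 can be illustrated with a minimal Python sketch. It is not the TriNetX implementation: the curated table, the target codes (`TGT-001` and so on), the thresholds, and the function names are all hypothetical, and the standard-library `difflib.SequenceMatcher` stands in for an external approximate-matching service such as the RxNorm "ApproximateTerm" API. The idea shown is only the general pattern: use a manually curated mapping when one exists, otherwise propose the closest target term and flag uncertain matches for human review.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical target vocabulary. The codes are placeholders, not real RxNorm identifiers.
TARGET_TERMS = {
    "TGT-001": "metformin 500 mg oral tablet",
    "TGT-002": "metoprolol 50 mg oral tablet",
    "TGT-003": "amoxicillin 250 mg oral capsule",
}

# Hypothetical curated (manually reviewed) mappings from local terms to target codes.
CURATED = {"metformin hcl 500mg tab": "TGT-001"}


@dataclass
class Mapping:
    source_term: str
    target_code: Optional[str]  # None when no candidate is good enough
    score: float                # similarity in [0, 1]
    needs_review: bool          # True when a human should confirm the mapping


def normalize(term: str) -> str:
    """Lowercase the term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def map_term(term: str, auto_accept: float = 0.9, min_score: float = 0.6) -> Mapping:
    """Map a local term to a target code, preferring curated mappings."""
    key = normalize(term)
    if key in CURATED:
        return Mapping(term, CURATED[key], 1.0, needs_review=False)

    # Approximate matching: keep the target label most similar to the input.
    best_code, best_score = None, 0.0
    for code, label in TARGET_TERMS.items():
        score = SequenceMatcher(None, key, label).ratio()
        if score > best_score:
            best_code, best_score = code, score

    if best_score < min_score:
        # Nothing plausible found: leave unmapped and send to manual review.
        return Mapping(term, None, best_score, needs_review=True)
    return Mapping(term, best_code, best_score, needs_review=best_score < auto_accept)
```

In this sketch, calling `map_term("metoprolol 50mg tablet")` proposes `TGT-002` but marks it for review because the similarity falls below the assumed `auto_accept` threshold, while a term with no plausible candidate is returned unmapped. The thresholds are arbitrary illustrations and would need tuning against reviewed mappings.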