[Preprint]. 2024 Dec 2:2024.12.01.24318253. [Version 1] doi: 10.1101/2024.12.01.24318253

Figure 1: UMAP representation of vector database storage of the Human Phenotype Ontology.

The HPO database was downloaded, and key fields (HPO ID, Title, Definition, Comments, and Synonyms) were extracted and restructured for vectorization. To increase the robustness of the database, additional phrases were generated using LLMs and validated using current HPO analysis tools. The current database contains 54,000 phrases that map to specific HPO IDs.
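
The extraction and restructuring step could look roughly like the sketch below. It is not the authors' pipeline: it assumes the ontology is available in OBO flat-file format, and the function names (`parse_terms`, `to_phrase_records`), the record layout, and the sample stanza with its placeholder IDs and text are all hypothetical. Each output record pairs one phrase with its HPO ID, ready to be passed to an embedding model.

```python
import re

# Illustrative OBO-style stanzas. The IDs and text are placeholders,
# NOT real HPO content; a real run would read the downloaded ontology file.
SAMPLE_OBO = '''\
[Term]
id: HP:9999999
name: Example phenotype
def: "A placeholder definition used for illustration." [PLACEHOLDER:ref]
comment: A placeholder comment.
synonym: "Sample phenotype" EXACT []
synonym: "Demo phenotype" RELATED []

[Term]
id: HP:9999998
name: Another example
'''

# Matches the first double-quoted string in a tag value (def and synonym tags).
QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def new_term():
    return {"id": None, "name": None, "definition": None,
            "comment": None, "synonyms": []}


def parse_terms(obo_text):
    """Yield one dict per [Term] stanza, keeping only the fields of interest."""
    term = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts: emit the previous term, if any.
            if term is not None and term["id"]:
                yield term
            term = new_term() if line == "[Term]" else None
            continue
        if term is None or ": " not in line:
            continue  # header lines, blank lines, or non-Term stanzas
        key, value = line.split(": ", 1)
        if key == "id":
            term["id"] = value
        elif key == "name":
            term["name"] = value
        elif key == "comment":
            term["comment"] = value
        elif key in ("def", "synonym"):
            match = QUOTED.search(value)
            if match is None:
                continue
            if key == "def":
                term["definition"] = match.group(1)
            else:
                term["synonyms"].append(match.group(1))
    if term is not None and term["id"]:
        yield term


def to_phrase_records(term):
    """Flatten one term into (phrase -> HPO ID) records for vectorization."""
    candidates = [("title", term["name"]),
                  ("definition", term["definition"]),
                  ("comment", term["comment"])]
    candidates += [("synonym", s) for s in term["synonyms"]]
    return [{"hpo_id": term["id"], "source": source, "phrase": phrase}
            for source, phrase in candidates if phrase]


records = [rec for term in parse_terms(SAMPLE_OBO)
           for rec in to_phrase_records(term)]

for rec in records:
    print(rec["hpo_id"], rec["source"], rec["phrase"], sep="\t")
```

Keeping one record per phrase, rather than one per term, means that a title, a definition, and each synonym are embedded separately while all resolving to the same HPO ID. LLM-generated phrases could be appended as further records with their own `source` label.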