[Preprint]. 2024 Dec 2:2024.12.01.24318253. [Version 1] doi: 10.1101/2024.12.01.24318253

Figure 2: RAG-HPO extracts phenotypic information and returns HPO terms.

RAG-HPO works in two phases, phenotype extraction and HPO assignment, to determine the appropriate HPO terms for free-text clinical input. In the first phase, the clinical text is passed to the LLM, which extracts phrases describing clinical abnormalities (A). The extracted phrases are then vectorized and compared against the HPO vector database using semantic similarity search (B). In the second phase, the 20 most similar database phrases for each extracted phrase are returned to the LLM for assignment of HPO terms (C). Once all extracted phrases have been analyzed, the list of HPO terms is returned to the user for verification and downstream analysis (D).
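
The retrieval and assignment steps in the caption (B and C) can be illustrated with a minimal Python sketch. It is not the RAG-HPO implementation: the bag-of-words `embed` function stands in for a real embedding model, the four-entry `HPO_PHRASES` dictionary stands in for the HPO vector database, and `assign_hpo` replaces the LLM's choice with a simple "take the best match" rule. All function and variable names are hypothetical, and the HPO identifiers are illustrative and should be checked against the current HPO release.

```python
import math
from collections import Counter

# Hypothetical miniature stand-in for the HPO vector database: phrase -> HPO ID.
# IDs are illustrative; verify them against the current HPO release.
HPO_PHRASES = {
    "seizure": "HP:0001250",
    "global developmental delay": "HP:0001263",
    "microcephaly": "HP:0000252",
    "hearing impairment": "HP:0000365",
}


def embed(text):
    """Toy embedding: word counts. An assumption standing in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_candidates(phrase, k=20):
    """Step B: rank database phrases by similarity to the extracted phrase, keep the top k."""
    query = embed(phrase)
    scored = [
        (cosine(query, embed(db_phrase)), db_phrase, hpo_id)
        for db_phrase, hpo_id in HPO_PHRASES.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def assign_hpo(candidates):
    """Step C stand-in: the real system asks an LLM to choose among the candidates.
    Here we simply take the best-scoring one, or None if nothing overlaps."""
    if not candidates:
        return None
    score, _, hpo_id = candidates[0]
    return hpo_id if score > 0 else None


def run_pipeline(extracted_phrases, k=20):
    """Map each extracted phrase (output of step A, assumed given) to an HPO ID."""
    return {
        phrase: assign_hpo(top_k_candidates(phrase, k))
        for phrase in extracted_phrases
    }


if __name__ == "__main__":
    # Phrases as they might come out of the LLM extraction step (A).
    for phrase, hpo_id in run_pipeline(["developmental delay", "hearing impairment"]).items():
        print(phrase, "->", hpo_id)
```

In this sketch the candidate list is capped at `k=20` to mirror the caption; with only four database entries the cap has no effect, but with a full phrase database it bounds how many candidates would be handed to the LLM for each extracted phrase.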