Author manuscript; available in PMC: 2026 Feb 20.
Published in final edited form as: Proc Conf Empir Methods Nat Lang Process. 2025 Nov;2025:27337–27362. doi: 10.18653/v1/2025.emnlp-main.1390

Figure 1:

The Assay2Mol workflow. A chemist provides a target description, which is used to retrieve BioAssays from the pre-embedded vector database. After filtering for relevance, the retrieved BioAssays are summarized by an LLM. Each BioAssay ID is then used to retrieve the corresponding experimental table. The final molecule generation prompt combines the target description, the BioAssay summaries, and selected test molecules with their associated test outcomes, enabling the LLM to generate relevant active molecules. Icons are from Flaticon.com and svgrepo.com.
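The data flow in the caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the bag-of-words `embed` function stands in for a real embedding model and vector database, `is_relevant` and `summarize` are placeholders for the relevance filter and the LLM summarization step, and all names, thresholds, assay IDs, and molecules are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class BioAssay:
    aid: int                                    # hypothetical assay ID
    description: str                            # free-text assay description
    table: list = field(default_factory=list)   # (SMILES, outcome) rows


def embed(text):
    """Toy bag-of-words embedding; an assumption standing in for a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, assays, top_k=2):
    """Rank assays by similarity to the target description and keep the top k."""
    q = embed(query)
    ranked = sorted(
        assays, key=lambda a: cosine(q, embed(a.description)), reverse=True
    )
    return ranked[:top_k]


def is_relevant(query, assay, threshold=0.2):
    """Placeholder relevance filter (hypothetical similarity cutoff)."""
    return cosine(embed(query), embed(assay.description)) >= threshold


def summarize(assay):
    """Placeholder for the LLM summarization step: truncates the description."""
    return assay.description[:80]


def build_prompt(query, assays, max_molecules=2):
    """Combine description, summaries, and tested molecules into one prompt."""
    parts = [f"Target: {query}"]
    for assay in assays:
        parts.append(f"BioAssay {assay.aid} summary: {summarize(assay)}")
        for smiles, outcome in assay.table[:max_molecules]:
            parts.append(f"  {smiles} -> {outcome}")
    parts.append("Propose new molecules likely to be active against the target.")
    return "\n".join(parts)


# Made-up assays and outcomes, for illustration only.
demo_assays = [
    BioAssay(1, "inhibition of kinase EGFR in cell assay",
             [("CCO", "inactive"), ("c1ccccc1", "active")]),
    BioAssay(2, "binding to dopamine receptor D2", [("CCN", "active")]),
]
demo_query = "EGFR kinase inhibitors"
demo_hits = [a for a in retrieve(demo_query, demo_assays)
             if is_relevant(demo_query, a)]
demo_prompt = build_prompt(demo_query, demo_hits)
```

In this sketch, retrieval and filtering are kept as separate steps so that a cheap similarity search can propose candidates while a stricter check decides which assays contribute context to the final prompt; the resulting `demo_prompt` string is what would be sent to the generating LLM.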