Figure 1:

The Assay2Mol workflow. A chemist provides a target description, which is used to retrieve BioAssays from the pre-embedded vector database. After filtering for relevance, the BioAssays are summarized by an LLM. Each BioAssay ID is then used to retrieve the corresponding experimental table. The final molecule generation prompt is formed by combining the description, the summaries, and selected test molecules with their associated test outcomes, enabling the LLM to generate relevant active molecules. Icons are from Flaticon.com and svgrepo.com.
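The retrieval and prompt-assembly steps in the figure can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `BioAssay` record, the toy two-dimensional embeddings, the in-memory list standing in for the vector database, and the prompt wording are hypothetical and are not taken from the Assay2Mol implementation. The relevance-filtering and LLM summarization steps are represented only by pre-written summaries passed in as a dictionary.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class BioAssay:
    aid: int                # hypothetical BioAssay ID
    description: str
    embedding: list[float]  # pre-computed embedding (toy values in this sketch)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_emb, assays, top_k=2):
    """Return the top_k assays most similar to the query embedding."""
    ranked = sorted(assays, key=lambda a: cosine(query_emb, a.embedding), reverse=True)
    return ranked[:top_k]


def build_prompt(target, summaries, tables):
    """Combine the target description, assay summaries, and tested molecules.

    summaries: {aid: summary text}, assumed to come from an LLM summarizer.
    tables:    {aid: [(SMILES, outcome), ...]}, assumed experimental records.
    """
    lines = [f"Target description: {target}", ""]
    for aid, summary in summaries.items():
        lines.append(f"BioAssay {aid} summary: {summary}")
        for smiles, outcome in tables.get(aid, []):
            lines.append(f"  {smiles} -> {outcome}")
    lines.append("")
    lines.append("Propose new molecules (as SMILES) likely to be active against the target.")
    return "\n".join(lines)


# Toy usage with made-up data.
assays = [
    BioAssay(1, "Inhibition of kinase X", [1.0, 0.0]),
    BioAssay(2, "Binding to kinase X", [0.9, 0.1]),
    BioAssay(3, "Unrelated cytotoxicity screen", [0.0, 1.0]),
]
hits = retrieve([1.0, 0.0], assays, top_k=2)
summaries = {a.aid: a.description for a in hits}  # placeholder for LLM summaries
tables = {1: [("CCO", "active")], 2: [("CCN", "inactive")]}
prompt = build_prompt("kinase X inhibitor", summaries, tables)
```

In a real pipeline the embeddings would come from a text-embedding model, and the summaries and final generation would each be an LLM call; the sketch only shows how the pieces described in the caption fit together.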