2024 Sep;34(9):1411–1420. doi: 10.1101/gr.279142.124

Figure 1.


Schematic of regLM. (A,B) DNA sequences are prefixed with a sequence of prompt tokens representing functional labels. (C) A HyenaDNA model is trained or fine-tuned to perform next token prediction on the labeled sequences. (D) The trained model is prompted with a sequence of prompt tokens to generate sequences with desired properties. (E,F) A sequence-to-function regression model trained on the same data set is used to check and filter the generated sequences. (G) The regulatory content of generated sequences is evaluated.
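As a rough illustration of panels A, B, E and F, the sketch below prefixes a DNA sequence with label tokens and then filters generated sequences against a predictor. It is a minimal sketch under stated assumptions, not the authors' implementation: the single-digit class tokens, the cutoff values, and the helper names `label_tokens`, `prefix_sequence` and `filter_generated` are all hypothetical, and the `predict` callable stands in for the sequence-to-function regression model.

```python
from bisect import bisect_right

def label_tokens(values, cutoffs):
    """Turn numeric activities into one class-token character per value.

    Assumption: each activity is binned by sorted cutoffs, and the bin
    index (0, 1, 2, ...) is used directly as the prompt token.
    """
    return "".join(str(bisect_right(cutoffs, v)) for v in values)

def prefix_sequence(seq, values, cutoffs):
    """Prepend the label tokens to the DNA sequence (panels A, B)."""
    return label_tokens(values, cutoffs) + seq

def filter_generated(seqs, prompt, predict, cutoffs):
    """Keep sequences whose predicted labels match the prompt (panels E, F).

    `predict` is a hypothetical stand-in for the regression model: it
    takes a sequence and returns one activity value per label.
    """
    return [s for s in seqs if label_tokens(predict(s), cutoffs) == prompt]

# Illustrative cutoffs: below 0.2 is class 0, 0.2 to 0.8 is class 1, above is class 2.
cutoffs = [0.2, 0.8]
labeled = prefix_sequence("ACGT", (0.1, 0.9), cutoffs)

# Toy predictor (fraction of G bases), used only to exercise the filter.
toy_predict = lambda s: [s.count("G") / len(s)]
kept = filter_generated(["GGGG", "AAAA"], "2", toy_predict, cutoffs)
```

Here `labeled` is the string a language model would be trained on, and `kept` holds only the generated sequences that the stand-in predictor assigns to the prompted class.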