Author manuscript; available in PMC: 2021 Dec 22.
Published in final edited form as: Cell Syst. 2020 Jun 25;11(1):49–62.e16. doi: 10.1016/j.cels.2020.05.007

Figure 1. DEN Architecture.

(A) A sequence produced by an input seed to a generative model (red-blue) shares the cost landscape with other generated sequences (orange). Patterns are penalized by similarity during training, resulting in an updated generator that transforms the red-blue seed into a different sequence, away from other patterns and potentially toward a new local minimum.

(B) Sequences are optimized on the basis of a pre-trained fitness predictor for some target function. We consider five different engineering applications: maximizing gene transcription, maximizing APA isoform selection, targeting 3′ cleavage positions, designing differential splicing between organisms, and improving green fluorescent proteins.

(C) In DENs, the generator is executed twice (independently) on two random seeds (z1, z2), producing two sequence patterns. One of the patterns is evaluated by the predictor, resulting in a fitness cost. The two patterns are also penalized on the basis of a similarity cost, and the generator is updated to minimize both costs.
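The cost computation in (C) can be sketched as follows. This is a toy illustration only, not the authors' implementation: `toy_generator`, `toy_fitness_cost`, `similarity_cost`, `den_cost`, and `similarity_weight` are hypothetical names, the seeds are simplified to integers, the similarity cost is assumed to be the fraction of matching positions, and the gradient update of the generator is omitted.

```python
import random

ALPHABET = "ACGT"

def toy_generator(z, length=8):
    """Hypothetical stand-in for the generator: maps a seed to a sequence."""
    rng = random.Random(z)
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def toy_fitness_cost(seq):
    """Hypothetical stand-in for the pre-trained predictor (lower is better)."""
    return 1.0 - seq.count("G") / len(seq)

def similarity_cost(seq_a, seq_b):
    """Assumed similarity measure: fraction of positions where the patterns agree."""
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def den_cost(z1, z2, similarity_weight=1.0):
    """Run the generator on both seeds, score one pattern, penalize similarity."""
    seq1 = toy_generator(z1)  # first, independent generator pass
    seq2 = toy_generator(z2)  # second, independent generator pass
    fitness = toy_fitness_cost(seq1)  # only one pattern is evaluated by the predictor
    similarity = similarity_cost(seq1, seq2)
    total = fitness + similarity_weight * similarity
    return total, fitness, similarity

# Draw two random seeds and compute the combined cost for this pair.
seed_rng = random.Random(0)
z1, z2 = seed_rng.getrandbits(32), seed_rng.getrandbits(32)
total, fitness, similarity = den_cost(z1, z2)
```

In an actual training loop, `total` would be differentiated with respect to the generator's weights; here it is only evaluated, to show how the two cost terms combine.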

(D) The PWM is multiplied by a mask (zeroing fixed nucleotides), and a template is added (encoding fixed letters). One-hot-coded patterns are produced by sampling nucleotides from stochastic neurons, and gradients are propagated by straight-through estimation.
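The masking, templating, and sampling steps in (D) can be sketched in plain Python as below. This is a minimal sketch under stated assumptions: all function names are hypothetical, the PWM is assumed to hold per-position probabilities over A, C, G, T, the mask is assumed to be one value per position (1 = free, 0 = fixed), and the straight-through backward pass is written out by hand as an identity through the sampling step rather than taken from any autodiff library.

```python
import random

ALPHABET = "ACGT"

def apply_mask_and_template(pwm, mask, template):
    """Zero the PWM at fixed positions (mask = 0) and add the template letters."""
    return [
        [p * m + t for p, t in zip(row, t_row)]
        for row, m, t_row in zip(pwm, mask, template)
    ]

def sample_one_hot(pwm, rng):
    """Sample one nucleotide per position and return it one-hot-coded."""
    one_hot = []
    for row in pwm:
        idx = rng.choices(range(len(row)), weights=row, k=1)[0]
        one_hot.append([1.0 if j == idx else 0.0 for j in range(len(row))])
    return one_hot

def straight_through_grad(grad_wrt_one_hot, mask):
    """Backward pass (assumed form): treat sampling as the identity, so the
    gradient on the one-hot sample is passed to the PWM, zeroed where masked."""
    return [[g * m for g in row] for row, m in zip(grad_wrt_one_hot, mask)]

def decode(one_hot):
    """Convert a one-hot-coded pattern back to a nucleotide string."""
    return "".join(ALPHABET[row.index(1.0)] for row in one_hot)

# Three positions; position 1 is fixed to 'G' by the mask and template.
pwm = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25], [0.1, 0.1, 0.1, 0.7]]
mask = [1.0, 0.0, 1.0]
template = [[0.0] * 4, [0.0, 0.0, 1.0, 0.0], [0.0] * 4]

rng = random.Random(0)
masked_pwm = apply_mask_and_template(pwm, mask, template)
one_hot = sample_one_hot(masked_pwm, rng)
sequence = decode(one_hot)

# Pretend the predictor returned a gradient of 1.0 for every one-hot entry.
grad_pwm = straight_through_grad([[1.0] * 4 for _ in pwm], mask)
```

The fixed position always decodes to the template letter and receives zero gradient, while free positions are sampled from the generator's PWM and pass the downstream gradient through unchanged.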

See also Figure S1.