Author manuscript; available in PMC: 2020 Nov 18.
Published in final edited form as: Ann Appl Stat. 2019 Jun 17;13(2):1268–1294. doi: 10.1214/18-aoas1233

Fig 1.

An example of how feature vectors are generated: if we believe that the mutation rate at a position depends on the 4-mer (i.e., the length-4 motif) starting one position to its left, then the feature vector for position j is a one-hot encoding of the sequence that appears in positions j − 1 through j + 2. More formally, each element of the feature vector at position j indicates whether a motif m appears from start position j − j′ + 1 through end position j − j′ + len(m) (here len(m) = 4 and j′ = 2). The start and end positions are derived by aligning position j of the sequence with position j′ of the motif.
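
The indexing in this caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `all_motifs` and `motif_feature_vector`, the restriction to the alphabet A, C, G, T, the lexicographic ordering of motifs, and the choice to return `None` when the window runs off the sequence are all assumptions made for illustration. Positions are 1-indexed, as in the caption.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # assumed alphabet; ambiguous bases are not handled


def all_motifs(motif_len):
    """List every motif of the given length, in lexicographic order (assumed ordering)."""
    return ["".join(p) for p in product(NUCLEOTIDES, repeat=motif_len)]


def motif_feature_vector(seq, j, motif_len=4, j_prime=2):
    """One-hot encoding of the motif aligned to 1-indexed position j.

    Position j of the sequence is aligned with position j_prime of the motif,
    so the motif spans positions j - j_prime + 1 through j - j_prime + motif_len.
    """
    start = j - j_prime + 1        # 1-indexed start position
    end = j - j_prime + motif_len  # 1-indexed end position, inclusive
    if start < 1 or end > len(seq):
        # Boundary handling is an assumption of this sketch, not taken from the source.
        return None
    window = seq[start - 1:end]    # convert to Python's 0-indexed, end-exclusive slice
    return [1 if m == window else 0 for m in all_motifs(motif_len)]


# Usage: for j = 3 the window covers positions 2 through 5 of the sequence.
seq = "ACGTACGT"
vec = motif_feature_vector(seq, j=3)
active = all_motifs(4)[vec.index(1)]  # the single motif whose indicator is 1
```

With the defaults `motif_len=4` and `j_prime=2`, the vector has 4^4 = 256 entries, exactly one of which is 1: the entry for the 4-mer occupying positions j − 1 through j + 2.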