Inner–Outer Code. Encoding. The original information
is first encoded with an outer code that
introduces redundancy and protects against the loss of sequences.
In Grass et al.4 the original information
was first grouped into blocks of multiple sequences (light blue).
Then, each row was encoded with a Reed–Solomon code that adds
redundancy (yellow). The columns correspond to single DNA sequences.
These are labeled with a unique index (purple). Each column is then
encoded with an inner code that adds logical redundancy on the level
of each sequence (green). In general, the inner and outer codes need
not append the redundancy separately from the original data; they may
instead return a modified, longer codeword.
Decoding. The original information from the set of noisy sequences (errors
marked in red) is retrieved by first decoding the inner code. This
removes most errors within the sequences. For large error rates dominated
by insertions and deletions, this step may be preceded by a clustering
and alignment step that generates sequences with fewer errors from
multiple noisy copies. The inner-decoded sequences are then ordered by
their index and decoded by the outer code. Here, lost sequences
correspond to erasures and erroneous sequences to substitutions, both
of which the outer code corrects.
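
The encoding and decoding flow described above can be sketched in a few lines of Python. This is a toy illustration only, not the scheme of Grass et al.: a single XOR parity symbol per row stands in for the Reed–Solomon outer code, and a one-symbol checksum stands in for the inner code, so it only detects errors and the affected sequence is discarded and treated as an erasure. All function names are hypothetical, symbols are plain integers in 0..255, and the mapping to nucleotides is omitted.

```python
from functools import reduce
from operator import xor

def outer_encode(block):
    """Outer code stand-in: append one XOR parity symbol to every row."""
    return [row + [reduce(xor, row, 0)] for row in block]

def inner_encode(index, column):
    """Inner code stand-in: prepend the index, append a checksum symbol."""
    word = [index] + column
    return word + [sum(word) % 256]

def inner_decode(word):
    """Return (index, column), or None if the checksum does not match."""
    *body, check = word
    if sum(body) % 256 != check:
        return None
    return body[0], body[1:]

def encode(block):
    """Rows get outer redundancy; each column becomes one indexed sequence."""
    coded_rows = outer_encode(block)
    columns = [list(col) for col in zip(*coded_rows)]
    return [inner_encode(i, col) for i, col in enumerate(columns)]

def decode(received, n):
    """Inner-decode, order by index, then repair at most one erasure."""
    columns = {}
    for word in received:
        result = inner_decode(word)
        if result is not None:          # failed sequences become erasures
            index, column = result
            columns[index] = column
    missing = [i for i in range(n) if i not in columns]
    if len(missing) > 1:
        raise ValueError("single-parity outer code can repair only one erasure")
    rows = len(next(iter(columns.values())))
    if missing:
        # The XOR of all n symbols in a row is zero, so the missing symbol
        # equals the XOR of the surviving ones.
        columns[missing[0]] = [
            reduce(xor, (columns[i][r] for i in columns), 0) for r in range(rows)
        ]
    # Reassemble the rows in index order and drop the parity column.
    return [[columns[c][r] for c in range(n - 1)] for r in range(rows)]

block = [[10, 20, 30], [40, 50, 60]]
sequences = encode(block)
noisy = [sequences[3], sequences[0], sequences[2]]  # sequence 1 lost, order scrambled
assert decode(noisy, n=4) == block
```

A real Reed–Solomon outer code generalizes the parity step: with n − k redundant symbols per row it can correct e erasures and s substitutions at the same time, provided 2s + e ≤ n − k.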