2019 Oct 31;10:1078. doi: 10.3389/fgene.2019.01078

Figure 1.


Data separation into training and validation subsets. A single reporter is shown; the scheme was identical for all reporters. Yellow bars: training subset; brown bars: validation subset. (A) Original CAGI setup. For each reporter, the training subset of single-nucleotide variants (SNVs; 25% of the total) consists of multiple 16-bp blocks, each spanning neighboring reporter coordinates. (B) Continuous blocks covering 25% of the reporter length for each reporter, with a varying shift from the reporter 5’ end. (C) Training data with block lengths varying from 1 to 64 bp.
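The three splitting schemes in the caption can be illustrated with a minimal Python sketch. It is not the authors' code, and several details are assumptions made only for illustration: the function names `block_split` and `continuous_split` are hypothetical, positions are 0-based reporter coordinates, training blocks in schemes (A) and (C) are taken as every fourth block (the caption does not say how blocks were placed), and scheme (B) is modeled as a single continuous block per reporter.

```python
def block_split(length, block_len, train_every=4):
    """Split positions 0..length-1 into training and validation sets.

    Positions are grouped into consecutive blocks of `block_len`.
    ASSUMPTION: every `train_every`-th block goes to training, which
    gives roughly 25% training data when train_every is 4.
    """
    train, valid = [], []
    for pos in range(length):
        block_index = pos // block_len
        if block_index % train_every == 0:
            train.append(pos)
        else:
            valid.append(pos)
    return train, valid


def continuous_split(length, shift, fraction=0.25):
    """Use one continuous training block starting `shift` positions
    from the 5' end and covering `fraction` of the reporter length.

    ASSUMPTION: a single block per reporter that does not wrap around.
    """
    block = int(length * fraction)
    if shift < 0 or shift + block > length:
        raise ValueError("training block does not fit inside the reporter")
    train = list(range(shift, shift + block))
    valid = [p for p in range(length) if p < shift or p >= shift + block]
    return train, valid


# Scheme (A): 16-bp blocks on a hypothetical 256-bp reporter.
train_a, valid_a = block_split(256, block_len=16)

# Scheme (B): one continuous block shifted 32 bp from the 5' end.
train_b, valid_b = continuous_split(256, shift=32)

# Scheme (C): the same block layout with block lengths from 1 to 64 bp.
splits_c = {n: block_split(256, block_len=n) for n in (1, 2, 4, 8, 16, 32, 64)}
```

In this sketch every scheme keeps the training fraction at 25% of positions, so only the spatial arrangement of the training data changes between panels.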