Distribution of mismatches to the reference genome along the reads after alignment. (Left) First data set; (right) second data set. The curves are largely flat, indicating that mismatches are evenly distributed along the reads, apart from a mild increase towards the read ends, possibly due to reads containing insertions and deletions. We kept only data from the flat region of the curve, i.e., nucleotides 5 to 45 of each aligned read.
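
As an illustration only, the per-position mismatch profile and the trimming step could be sketched as follows. This is not the pipeline used for the figure: the function names `mismatch_profile` and `trim` are hypothetical, the reads are assumed to be gap-free and already paired with the reference segment they align to, and "nucleotides 5 to 45" is assumed to mean 1-based, inclusive positions.

```python
from collections import Counter


def mismatch_profile(pairs):
    """Return the fraction of mismatches at each read position.

    `pairs` is an iterable of (read, reference_segment) strings that are
    assumed to be aligned without gaps. Positions in the result are 0-based.
    """
    mismatches = Counter()
    coverage = Counter()
    for read, ref in pairs:
        for i, (read_base, ref_base) in enumerate(zip(read, ref)):
            coverage[i] += 1
            if read_base != ref_base:
                mismatches[i] += 1
    return [mismatches[i] / coverage[i] for i in range(len(coverage))]


def trim(read, start=5, end=45):
    """Keep nucleotides start..end of a read (assumed 1-based, inclusive)."""
    return read[start - 1:end]
```

Under these assumptions, `trim` keeps 41 nucleotides per read, and `mismatch_profile` applied to the untrimmed reads gives the kind of curve from which the flat region is chosen.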