2020 Aug 17;21:207. doi: 10.1186/s13059-020-02091-3

Fig. 2.

DiMSum error model estimates multiplicative and additive error sources in fitness scores. a Empirical variance of replicate fitness scores as a function of error estimates based on sequencing counts under Poisson assumptions in a deep mutational scan of TDP-43 (positions 290-331) [6]. Empirical variance (blue dots show average variance in equally spaced bins, error bars indicate avg. variance × (1 ± 2/ # variants per bin)) is over-dispersed compared to the baseline expectation that variance follows a Poisson distribution (black dashed line). The bimodality of the count-based error distribution results from the relatively small number of single nucleotide mutants, which have high counts (and thus low count-based error), and the many double nucleotide mutants, which have low counts (and thus higher count-based error). The DiMSum error model (red line) accurately captures the deviations of the empirical variance from the Poisson expectation. Inset: bold cyan and magenta lines indicate multiplicative error term contributions to variance corresponding to input and output samples, respectively (thin dashed lines give the input or output sample contributions to variance if the multiplicative error terms were 1). The horizontal green line indicates the additive error term contribution. The red line indicates the full DiMSum error model. b The same as a, but for a deep mutational scan of FOS [20] that shows more over-dispersion. c–f Multiplicative (c, e) and additive (in s.d. units; d, f) error terms estimated by the error model on the two datasets. Dots give mean parameters; error bars give 90% confidence intervals.
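
The way the caption's error terms combine can be illustrated with a minimal Python sketch. It assumes a simplified form in which the count-based (Poisson) variance of a fitness score is 1/input count + 1/output count, each count term is scaled by its own multiplicative factor, and the additive term (given in s.d. units) enters the variance squared. The function and parameter names are hypothetical, wild-type count terms are left out, and this is not DiMSum's own implementation.

```python
def poisson_variance(count_in: float, count_out: float) -> float:
    """Count-based variance under Poisson assumptions (simplified, assumed form)."""
    return 1.0 / count_in + 1.0 / count_out


def model_variance(
    count_in: float,
    count_out: float,
    mult_in: float = 1.0,   # hypothetical multiplicative term, input sample
    mult_out: float = 1.0,  # hypothetical multiplicative term, output sample
    add_sd: float = 0.0,    # hypothetical additive term, in s.d. units
) -> float:
    """Variance with multiplicative and additive error terms (illustrative only)."""
    input_part = mult_in / count_in      # input-sample contribution
    output_part = mult_out / count_out   # output-sample contribution
    additive_part = add_sd ** 2          # count-independent variance floor
    return input_part + output_part + additive_part


if __name__ == "__main__":
    # Made-up counts: a high-count and a low-count variant.
    for c_in, c_out in [(5000, 4000), (50, 40)]:
        base = poisson_variance(c_in, c_out)
        full = model_variance(c_in, c_out, mult_in=2.0, mult_out=3.0, add_sd=0.1)
        print(f"counts=({c_in}, {c_out}) poisson={base:.5f} model={full:.5f}")
```

With all multiplicative terms set to 1 and the additive term set to 0, the sketch reduces to the Poisson baseline; at very high counts the additive term dominates, which corresponds to the horizontal line in the inset.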