Fig. 4.
Intensity of interest in heterogeneity. The two sequences represent an SNV at the second base, G/C. When genomic DNA from an individual who is heterozygous at this SNV is sequenced with NGS to detect genetic variants, the fraction of short reads with C at the SNV site varies around 0.5, as indicated by the black line. The lower the depth, the wider the distribution. When the number of reads with C is 13 out of 30, it is reasonable to call this individual heterozygous at this SNV. However, when the number of reads with C is 3 out of 30, it is safe not to call this individual heterozygous. In this case, the researchers are interested in fractions around 0.5, and 3 out of 30 is regarded as noise. In contrast, suppose a cancer researcher is interested in the fraction of cancer cells that are heterozygous at a site, sequences the cancer cells, and determines the fraction to be 10%. The researcher does not ignore this finding, because a small fraction of the cancer cells carries the C allele, and the expected distribution of this fraction may follow the blue line.
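The read counts in the caption can be checked with a short calculation. The sketch below assumes that each read independently carries C with a fixed probability, so the number of C reads follows a binomial distribution. The caption does not state this model, and the function names and the read-level probability of 0.1 for the cancer case are illustrative assumptions.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """Probability of exactly k C reads out of n (assumed binomial model)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Probability of k or fewer C reads out of n."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def fraction_sd(n, p):
    """Standard deviation of the observed C fraction at depth n."""
    return sqrt(p * (1 - p) / n)

depth = 30

# How surprising is each count if the individual is heterozygous (p = 0.5)?
for c_reads in (13, 3):
    tail = binom_cdf(c_reads, depth, 0.5)
    print(f"{c_reads}/{depth} C reads: P(X <= {c_reads} | p = 0.5) = {tail:.3g}")

# Likelihood of 3/30 under the germline view (p = 0.5) and under an
# assumed 10% C fraction, as in the cancer example.
for p in (0.5, 0.1):
    print(f"P(X = 3 | p = {p}) = {binom_pmf(3, depth, p):.3g}")

# Lower depth gives a wider distribution of the observed fraction.
for n in (10, 30, 100):
    print(f"depth {n}: SD of C fraction at p = 0.5 is {fraction_sd(n, 0.5):.3f}")
```

Under these assumptions, 13 of 30 is an unremarkable outcome for a heterozygote, whereas 3 of 30 is far more probable when the true C fraction is small. This matches the caption: the same observation is noise for one question and signal for the other.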