2023 Aug 12;24:311. doi: 10.1186/s12859-023-05424-8

Fig. 3.

Heterogeneity is additively decomposable. The heterogeneity of a population of cells (5 cells in this illustration) with respect to the expression of a gene g, I(g), can be decomposed into inter- and intra-cluster heterogeneities for any proposed clustering S (here, two subpopulations, or clusters, of 3 yellow and 2 purple cells). The inter-cluster heterogeneity H_S(g) is determined by independently aggregating all transcripts (shown as horizontal lines) associated with each subpopulation in S and then taking the Kullback-Leibler divergence (KLD) of the resulting distribution from the uniform distribution of the transcripts over C clusters. It measures how far the assignment of transcripts to clusters departs from uniform. The intra-cluster heterogeneity h_S(g) is the weighted sum of the heterogeneities of the constituent subpopulations, each considered independently, with weights given by the fraction of transcripts in each subpopulation. It represents the average heterogeneity of the proposed clusters, accounting for disparities in the number of transcripts assigned to each. In this toy example, the overall population heterogeneity of gene g, I(g)=0.55, decomposes as the sum of the inter-cluster heterogeneity H_S(g)=0.33 and the intra-cluster heterogeneity h_S(g)=0.22, where the cluster weights are 2/10=0.2 and 8/10=0.8. Further details and formulae are provided in the “Methods” section.
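The additive decomposition described in this caption can be checked numerically. The following is a minimal sketch, not the authors' implementation, and it rests on three labeled assumptions: (1) natural logarithms are used; (2) the reference distribution for the inter-cluster term gives each cluster a probability equal to its share of cells, which is what a uniform distribution over cells becomes after aggregation; (3) the per-cell transcript counts are hypothetical, chosen only because they agree with the caption's rounded values and cluster weights. The function names are illustrative.

```python
import math


def kld(p, q):
    """Kullback-Leibler divergence D(p || q) in nats; zero-probability terms contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def heterogeneity(counts):
    """KLD of the transcript distribution over cells from the uniform distribution over cells."""
    total = sum(counts)
    n_cells = len(counts)
    p = [c / total for c in counts]
    return kld(p, [1 / n_cells] * n_cells)


def decompose(clusters):
    """Return (overall, inter-cluster, intra-cluster) heterogeneity for a list of clusters."""
    all_counts = [c for cluster in clusters for c in cluster]
    total = sum(all_counts)
    n_cells = len(all_counts)
    # Fraction of transcripts falling in each cluster (the weights in the caption).
    weights = [sum(cluster) / total for cluster in clusters]
    # Assumed reference: each cluster's share of cells.
    cell_share = [len(cluster) / n_cells for cluster in clusters]
    inter = kld(weights, cell_share)
    # Transcript-weighted average of within-cluster heterogeneities.
    intra = sum(w * heterogeneity(cluster)
                for w, cluster in zip(weights, clusters) if w > 0)
    return heterogeneity(all_counts), inter, intra


# Hypothetical counts: 3 yellow cells holding 2 transcripts, 2 purple cells holding 8.
yellow = [2, 0, 0]
purple = [4, 4]

overall, inter, intra = decompose([yellow, purple])
print(round(overall, 2), round(inter, 2), round(intra, 2))  # 0.55 0.33 0.22
```

Under these assumptions the overall value equals the sum of the inter- and intra-cluster terms up to floating-point error, for any choice of counts and any partition of the cells.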