2021 Mar 24;13:45. doi: 10.1186/s13073-021-00842-w

Fig. 2.


Graphical model representation of CACTUS. Circle nodes are labeled with the random variables of the model. Arrows correspond to the local conditional probability distributions of child variables given their parent variables. Observed variables are shown as grayed nodes. Double-circled nodes are obtained deterministically from their parent variables. Small filled circles correspond to hyperparameters. $C_{i,k}$ denotes the true (corrected) genotype of clone $k$ at variant position $i$. $\Omega_{i,k}$ denotes the input clone genotypes, with $\Omega_{i,k}=1$ if mutation $i$ is present in clone $k$ and 0 otherwise. $G_{j,q}$ denotes the distance of cell $j$ to cluster $q$, computed from the input clustering of cells. $T_j=q$ indicates that cell $j$ is in cluster $q$. $p_{j,q}$ is interpreted as the success probability for cell $j$ to switch to cluster $q$. $A_{i,j}$ denotes the observed count of unique transcripts with the alternative (mutated) nucleotide mapped to position $i$ in cell $j$. $D_{i,j}$ denotes the total count of unique transcripts mapped to that position in that cell. $I_q=k$ represents the assignment of cluster $q$ to clone $k$. $\theta_i$ denotes the success probability of observing a transcript with the alternative nucleotide at position $i$ in a cell that carries this mutation, and $\theta_0$ the success probability of observing a transcript with the alternative nucleotide at a position whose mutation is not present in the cell. $\xi$ is the error rate for the genotypes. $\{\nu_0,\nu_1,\kappa\}$ constitutes the set of hyperparameters of the model.
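
The observation part of the model can be illustrated with a minimal sketch. The caption calls θi and θ0 "success probabilities", which suggests, but does not state, a binomial model for the alternative count A given the total count D; that distributional choice is an assumption here. The function names, the toy counts, and the parameter values below are hypothetical and not taken from CACTUS. The sketch scores one cell against two candidate clone genotypes C and ignores the clustering variables (G, T, p, I), the genotype error rate ξ, and the hyperparameters.

```python
import math


def log_binom_pmf(a, d, p):
    """Log of Binomial(a | d, p) for 0 < p < 1 (assumed observation model)."""
    log_coeff = math.lgamma(d + 1) - math.lgamma(a + 1) - math.lgamma(d - a + 1)
    return log_coeff + a * math.log(p) + (d - a) * math.log1p(-p)


def cell_log_likelihood(alt, depth, genotype, theta, theta0):
    """Sum over positions i of log P(A_ij | D_ij, C_ik) for one cell j and one clone k.

    alt[i]      : A_ij, transcripts with the alternative nucleotide at position i
    depth[i]    : D_ij, total transcripts at position i
    genotype[i] : C_ik, 1 if clone k carries mutation i, else 0
    theta[i]    : success probability when the mutation is present
    theta0      : success probability when the mutation is absent
    """
    total = 0.0
    for a, d, c, th in zip(alt, depth, genotype, theta):
        p = th if c == 1 else theta0
        total += log_binom_pmf(a, d, p)
    return total


# Toy data for a single cell at three variant positions (hypothetical values).
alt = [3, 0, 1]
depth = [5, 4, 2]
theta = [0.5, 0.5, 0.5]
theta0 = 0.01

clone_a = [1, 0, 1]  # genotype consistent with the observed counts
clone_b = [0, 1, 0]  # genotype inconsistent with the observed counts

ll_a = cell_log_likelihood(alt, depth, clone_a, theta, theta0)
ll_b = cell_log_likelihood(alt, depth, clone_b, theta, theta0)
```

With these toy values the cell is scored higher under `clone_a` than under `clone_b`, because alternative reads at positions 1 and 3 are unlikely when the mutation is assumed absent. In the full model described by the caption, such evidence would be combined across the cells of a cluster before the cluster is assigned to a clone.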