a Schematic of data integration and tumor cell identification. All cells were classified as normal or tumor using CopyKAT, then integrated and clustered based on their expression profiles. b The cell number (left column), cell composition (middle column), and the mean number of circRNAs in each cell (right column). Cells in each sample were classified as normal or tumor according to copy number variation. c Log-scaled circRNA expression values in normal and tumor cells from primary tumors (breast, n = 4687/8246 cells) and metastatic tumors (lymph, n = 1679/1465 cells; lung, n = 614/1770 cells). Grey and red lines indicate normal and tumor cells, respectively. d Log-scaled circRNA expression values in each cell type. Grey and red indicate normal (n = 660 / 28 / 660 / 885 / 3339 / 140 / 205 / 970 / 93) and tumor (n = 29 / 2 / 108 / 36 / 9630 / 185 / 792 / 533 / 167) cells, respectively. Error bars indicate ± SD of the plotted values. e Log-scaled circRNA expression values divided by molecular subtype. The x axis indicates molecular subtypes ranked from best to worst prognosis. Filled colors indicate normal (n = 518 / 1393 / 536 / 4533) and tumor (n = 852 / 1081 / 2926 / 6622) cells, respectively. Red points indicate the mean of the plotted data. f Trajectory reconstruction of all epithelial cells reveals two branches in tumor progression; cells are colored by their cluster assignment in the t-SNE plot. g GO enrichment analysis of 12 cell clusters (n = 149 / 239 / 62 / 143 / 89 / 165 / 209 / 200 / 119 / 183 / 159 / 126 cells) ordered by EMT score. All clusters were divided into four stages according to the enriched biological processes. h The distribution of tumor and normal cells in the t-SNE and trajectory projection plots (left), and the cell composition of each cluster (right). i Change in circRNA expression profiles in the EMT cluster. The y axis represents log-scaled circRNA expression values, and point size indicates the number of expressing cells.
Center lines in all box plots and violin plots indicate the median, and box limits indicate the upper and lower quartiles of the plotted values. The upper and lower whiskers extend to the largest and smallest values within 1.5 × IQR of the box limits. **P < 0.01, ****P < 0.0001, Wilcoxon rank-sum test (two-sided). Source data are provided as a Source Data file.
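
The box-plot convention described above (median, quartiles, and whiskers at the most extreme values within 1.5 × IQR of the box limits) can be illustrated with a minimal sketch. The function name `box_stats`, the quartile interpolation method, and the example values are assumptions made for illustration; they are not the authors' plotting code, and plotting libraries may compute quartiles slightly differently.

```python
from statistics import median, quantiles


def box_stats(values, k=1.5):
    """Return box-plot statistics using whiskers within k * IQR of the box."""
    data = sorted(values)
    # Quartiles by linear interpolation between data points
    # (assumption: the quartile method used for the figure is not stated).
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    # Whiskers end at the most extreme observed values inside the fences.
    lower_whisker = min(x for x in data if x >= low_fence)
    upper_whisker = max(x for x in data if x <= high_fence)
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
    }


# Hypothetical log-scaled expression values, not taken from the study.
example = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
stats = box_stats(example)
print(stats)
```

In this hypothetical input the value 20.0 lies beyond the upper fence (6.0 + 1.5 × 4.0 = 12.0), so the upper whisker stops at 7.0 and 20.0 would be drawn as an outlier.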