Figure 5: Evidence of direct signaling-induced states in lung disease atlases.

(A) Schematic of the approach for inferring signaling signatures in disease data by comparing matched cell states between disease and control samples. A signaling score is calculated for each cell state using the gene loadings from Fig. 3E. Scores are then compared by rank-sum testing, with random downsampling of genes to ensure that no single gene dominates the score. See Methods for details.
(B) Disease datasets used in the analyses. Each dataset is assigned a color that identifies it in the subsequent panels.
(C) Fold-change increase in signaling program usage in disease versus control states across all diseases. Colors indicate the dataset of origin of each cell state, as in B.
(D) Fold change in expression of immune signaling programs in COVID-19 cells. Each point represents a cell state. Cell state annotations are taken from the respective papers listed in B.
(E) Induction of TGFB1 signaling programs in cells identified as aberrant in the original papers, compared with all other cells. The aberrant states are Krt17+/5−, aberrant basaloid, aberrant basaloid, and ECM-high in the four disease datasets, respectively.
(F) Heatmap showing expression of the top 20 genes of the TGFB1–1 and TGFB2–1 programs across states and datasets.
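
The scoring and testing idea outlined in (A) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure from the Methods: it assumes cells-by-genes expression matrices for one matched cell state, a per-cell score defined as the loading-weighted sum of expression, and a Wilcoxon rank-sum test repeated over random gene subsets. The function names, the `frac` and `n_iter` parameters, and the use of medians to summarize the repeats are all hypothetical.

```python
import numpy as np
from scipy.stats import ranksums


def signaling_score(expr, loadings, gene_idx):
    """Loading-weighted sum of expression over the selected genes (one score per cell)."""
    return expr[:, gene_idx] @ loadings[gene_idx]


def downsampled_ranksum(disease, control, loadings, frac=0.5, n_iter=100, seed=0):
    """Compare disease vs. control scores over repeated random gene subsets.

    disease, control: cells x genes arrays for one matched cell state (assumed layout).
    loadings: per-gene program loadings, in the same gene order.
    Returns the median rank-sum statistic and median p-value across subsets.
    """
    rng = np.random.default_rng(seed)
    n_genes = loadings.shape[0]
    k = max(1, int(round(frac * n_genes)))  # genes kept per repeat
    stats, pvals = [], []
    for _ in range(n_iter):
        # Draw a random gene subset so no single gene can drive every repeat.
        idx = rng.choice(n_genes, size=k, replace=False)
        res = ranksums(
            signaling_score(disease, loadings, idx),
            signaling_score(control, loadings, idx),
        )
        stats.append(res.statistic)
        pvals.append(res.pvalue)
    return float(np.median(stats)), float(np.median(pvals))
```

A positive median statistic indicates higher scores in the disease cells. If the result stays significant across most subsets, the signal is unlikely to be carried by one gene alone.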