Author manuscript; available in PMC: 2021 Oct 1.
Published in final edited form as: Cell Stem Cell. 2020 Sep 4;27(4):663–678.e8. doi: 10.1016/j.stem.2020.07.022

Organoids model transcriptional hallmarks of oncogenic KRAS activation in lung epithelial progenitor cells

Antonella F M Dost 1,14, Aaron L Moye 1,14, Marall Vedaie 2,3, Linh M Tran 4, Eileen Fung 5, Dar Heinze 2,6, Carlos Villacorta-Martin 2, Jessie Huang 2,3, Ryan Hekman 7, Julian H Kwan 7, Benjamin C Blum 7, Sharon M Louie 1, Samuel P Rowbotham 1, Julio Sainz de Aja 1, Mary E Piper 8, Preetida J Bhetariya 1,8, Roderick T Bronson 9, Andrew Emili 7,10, Gustavo Mostoslavsky 2,6, Gregory A Fishbein 11, William D Wallace 11,12, Kostyantyn Krysan 4, Steven M Dubinett 4,13, Jane Yanagawa 5,13,*, Darrell N Kotton 2,3,*, Carla F Kim 1,15,*
PMCID: PMC7541765  NIHMSID: NIHMS1619122  PMID: 32891189

Summary

Mutant KRAS is a common driver in epithelial cancers. Nevertheless, molecular changes occurring early after activation of oncogenic KRAS in epithelial cells remain poorly understood. We compared transcriptional changes at single cell resolution after KRAS activation in four sample sets. In addition to patient samples and genetically engineered mouse models, we developed organoid systems from primary mouse and human induced pluripotent stem cell derived lung epithelial cells to model early-stage lung adenocarcinoma. In all four settings, alveolar epithelial progenitor (AT2) cells expressing oncogenic KRAS had reduced expression of mature lineage identity genes. These findings demonstrate the utility of our in vitro organoid approaches for uncovering the early consequences of oncogenic KRAS expression. This resource provides an extensive collection of data sets and describes organoid tools to study the transcriptional and proteomic changes that distinguish normal epithelial progenitor cells from early-stage lung cancer, facilitating the search for targets for KRAS-driven tumors.

Keywords: Organoid, early stage lung cancer, stage IA lung adenocarcinoma, single cell RNA Sequencing, KRAS, loss of differentiation, tumor progression, developmental programs, iPSC, alveolar

eTOC Blurb

Early-stage lung cancer is poorly understood. Here, the authors introduce new organoid systems to model lung cancer. KRAS-expressing alveolar progenitor cells had reduced expression of lineage genes in mouse and organoid models and stage IA cancers. This is the first report of loss of differentiation in early-stage lung cancer.


Introduction

KRAS is one of the most frequently mutated oncogenes across epithelial cancers. Limited understanding of the biology of KRAS and its downstream effectors in epithelial cells likely contributes to the scarcity of therapeutic targets for KRAS mutant cancers. Oncogenic KRAS is associated with poor prognosis and therapy resistance (Haigis, 2017). Experiments in tumor cell lines revealed that the RAF/MAPK and PI3K/AKT pathways are activated upon overexpression of oncogenic KRAS, yet pathway activation is distinct when oncogenic KRAS is expressed at physiological levels from its endogenous promoter (Tuveson et al., 2004; Zhu et al., 2014).

Oncogenic KRAS mutations are driving events in lung cancer and are present in 30% of lung adenocarcinomas (LUAD) (Collisson et al., 2014). Furthermore, expression of oncogenic KRASG12D is sufficient to initiate LUAD in genetically engineered mouse models (GEMMs) (Jackson et al., 2001). Despite the significant impact of KRAS mutations in lung cancer, the effect that oncogenic KRAS has on epithelial cells shortly after its activation, besides initiation of proliferation, has not been explored.

Recent advances in technologies such as single cell RNA-Sequencing (scRNA-Seq) and organoids now make it possible to study transcriptional changes that follow oncogenic KRAS activation with single cell resolution in a controlled environment. Previously published lung tumor organoids were derived from tumor cell lines or from tumors (Kaisani et al., 2014; Kim et al., 2019; Sachs et al., 2019), and therefore do not model the events in early-stage tumorigenesis. Efforts have been made to model all stages of cancer progression with organoids in non-lung tissues (Drost et al., 2015; Li et al., 2014; Matano et al., 2015; Seino et al., 2018). We previously demonstrated that primary murine lung progenitor cells survive in vitro activation of oncogenic KRAS in organoid cultures (Zhang et al., 2017a). However, the specific effect of oncogenic KRAS on transcriptional states was not studied in any of these reports.

To facilitate the study of oncogenic KRAS-induced changes, we analyzed data from an early-stage KrasG12D GEMM, in vitro induced KrasG12D AT2-derived murine lung organoids, in vitro induced KRASG12D human lung organoids derived from induced pluripotent stem cells (iPSCs), and lesions from stage IA LUAD patients, all at single cell resolution. Characterization of the data revealed that a reduction in AT2 lineage marker gene expression is an early consequence of oncogenic KRAS. Our organoid systems provide tools to rapidly and accurately model LUAD progression in vitro, and our datasets provide a useful resource for the cancer research community.

Results

ScRNA-Seq of distal lung epithelium reveals distinct transcriptional clusters of KRASG12D activated cells during early tumorigenesis

We used scRNA-Seq to define transcriptional changes in distal epithelial cell populations during early stage LUAD in the KrasLSL-G12D; Rosa26LSL-YFP (henceforth, KY) LUAD GEMM (Jackson et al., 2001). KY mice were infected with an adenovirus 5 vector containing Cre-recombinase driven by the ubiquitous CMV promoter (Ad5-CMV-Cre) (Figure 1A). After 7 weeks we observed small clusters of YFP+ cells consistent with atypical adenomatous hyperplasia (Figure S1A). Viable, recombined (CD31−/CD45−/EPCAM+/YFP+; henceforth, YFP+) and non-recombined (CD31−/CD45−/EPCAM+/YFP−; henceforth, YFP−) epithelial cells were collected using fluorescence activated cell sorting (FACS) (Figure S1B). We used 10X Genomics scRNA-Seq to examine gene expression during early-stage LUAD and analyzed the data using ScanPy (Wolf et al., 2018). After pre-processing we focused our attention on clusters containing more than 100 cells, leaving four clusters for further analysis (Figure 1B, S1C, and S1D; see STAR methods). Cluster 1 (C1) was composed primarily of YFP+ cells and cluster 0 (C0) of YFP− cells, while cluster 2 (C2) and cluster 3 (C3) had equivalent contributions from YFP+ and YFP− cells (Figure 1C and 1D). Expression of AT2 markers Sftpc and Lyz2 was highest in C0 and C1, ciliated cell markers Foxj1 and Cd24a in C2, and club cell markers Scgb1a1 and Scgb3a2 in C3 (Figure 1E). While both YFP+ and YFP− cells were present in C2 and C3, only C0 and C1 with elevated AT2 marker expression formed transcriptionally distinct YFP− and YFP+ clusters (Figure 1B, 1C, and 1D). Correlation analysis between all clusters revealed that C0 and C1 share some degree of similarity, while C2 and C3 were more distinct (Figure S1E).
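
The cluster-size filter and the per-cluster batch contributions described in this paragraph can be sketched in plain Python. This is a minimal illustration and not the authors' ScanPy pipeline: the function names, the `(cluster, batch)` tuple layout, and the toy cell counts are assumptions made for the example. Only the 100-cell threshold comes from the text.

```python
from collections import Counter, defaultdict

def filter_clusters(cells, min_cells=100):
    """Keep only cells from clusters with more than `min_cells` cells.

    `cells` is a list of (cluster_label, batch_label) tuples, one per cell.
    """
    sizes = Counter(cluster for cluster, _ in cells)
    keep = {c for c, n in sizes.items() if n > min_cells}
    return [cell for cell in cells if cell[0] in keep]

def batch_fractions(cells):
    """Return {cluster: {batch: fraction of that cluster's cells}}."""
    counts = defaultdict(Counter)
    for cluster, batch in cells:
        counts[cluster][batch] += 1
    return {
        cluster: {b: n / sum(c.values()) for b, n in c.items()}
        for cluster, c in counts.items()
    }

# Toy input with made-up counts (hypothetical, not the paper's data).
toy_cells = (
    [("C0", "YFP-")] * 180 + [("C0", "YFP+")] * 20
    + [("C1", "YFP+")] * 150
    + [("C4", "YFP+")] * 40  # below the threshold, dropped by the filter
)
kept = filter_clusters(toy_cells)
fractions = batch_fractions(kept)
```

In this toy input the small cluster is discarded, and the remaining clusters are summarized by the fraction of cells from each sorted population, which is the kind of summary shown in Figure 1D.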

Figure 1. ScRNA-Seq of distal lung epithelium reveals distinct transcriptional clusters of KRASG12D activated cells during early tumorigenesis. See also Figure S1.


(A) Experimental strategy to analyze epithelial populations during early-stage LUAD in vivo using scRNA-Seq.

(B) (C) Clustering of transcriptomes using UMAP. Cells are colored based on (B) Louvain clusters or (C) Batch ID.

(D) Batch contributions to each Louvain cluster with number of cells indicated.

(E) Log expression of lung epithelial cell marker genes in each Louvain cluster.

(F)(G)(H)(I) Z-scores of indicated signatures in Louvain clusters 0 and 1. Dashed line marks median of reference sample.

AT2 cells have previously been proposed as the LUAD cell of origin (Lin et al., 2012; Xu et al., 2012) and were the only lung epithelial cell type that formed a transcriptionally distinct cluster upon KRASG12D expression. Hence, we focused our studies on the consequences of KRAS activation in AT2 cells. To test if the transcriptional changes in YFP+ C1 agree with previously published data, we calculated z-scores using gene signatures we expected to be elevated in YFP+ C1. Consistent with published observations, KRAS and NF-kappaB target gene signatures were elevated in C1 cells, as was a proliferation signature, indicating that the cluster is transcriptionally primed to proliferate (Figure 1F, 1G, 1H; table S1) (Barbie et al., 2009; Bild et al., 2006; Meylan et al., 2009; Travaglini et al., 2019).
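
One common way to turn a gene signature into a per-cell z-score is sketched below. The exact scoring procedure is described in the STAR methods and is not reproduced here, so this is an assumption: each gene is standardized across all cells and the per-gene z-scores are averaged over the signature. The function name, gene names, and expression values are hypothetical.

```python
from statistics import mean, median, pstdev

def signature_scores(expr, signature):
    """Mean per-gene z-score across signature genes, one score per cell.

    `expr` maps gene -> list of expression values (same cell order per gene).
    Genes missing from `expr` or with zero variance are skipped.
    """
    z_by_gene = []
    for gene in signature:
        values = expr.get(gene)
        if values is None:
            continue
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue
        z_by_gene.append([(v - mu) / sd for v in values])
    if not z_by_gene:
        raise ValueError("no usable signature genes")
    n_cells = len(z_by_gene[0])
    return [mean(z[i] for z in z_by_gene) for i in range(n_cells)]

# Toy data: 3 reference cells followed by 3 KRAS-activated cells (made up).
toy_expr = {
    "GeneA": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],
    "GeneB": [0.5, 0.4, 0.6, 2.0, 2.1, 1.9],
    "GeneC": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],  # constant, skipped
}
scores = signature_scores(toy_expr, ["GeneA", "GeneB", "GeneC", "GeneD"])
reference_median = median(scores[:3])  # analogous to the dashed line in the plots
activated_median = median(scores[3:])
```

Comparing each cluster's scores to the median of the reference cluster mirrors how the dashed line is used in Figure 1F to 1I.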

Next, we performed differential expression (DE) analysis to identify genes, transcription factors and cofactors (TF/TFCs) that define C0 and C1 (Figure S1F, S1G; table S1 and S2). We found that the lung fate TF Nkx2–1 and the AT2 identity TF Etv5 were enriched in C0 (Morrisey and Hogan, 2010; Zhang et al., 2017b). In contrast, the proto-oncogene Myc (Chen et al., 2018; Dang, 2012; Poole and van Riggelen, 2017) and Id1, a TF shown to promote non-small cell lung cancer (NSCLC) cell proliferation and metastasis, were upregulated in C1 (Antonangelo et al., 2016; Cheng et al., 2011; Pillai et al., 2011). Moreover, Foxq1, a TF found to be increased in NSCLC tumor tissue compared to paired adjacent tissue (Li et al., 2020), Etv4, a TF expressed during lung development (Herriges et al., 2015), and Klf4, important for inducing pluripotency in cells (Takahashi and Yamanaka, 2006), had elevated expression in C1. Hence, upon KRASG12D expression, AT2 cells downregulate TF/TFCs that maintain AT2 identity, while factors that promote cancer growth, are important for developmental processes, or induce pluripotency have increased expression. We tested if the expression of these TF/TFCs correlated with a transition to a less differentiated state as often observed in late stage cancers. Indeed, a signature consisting of 46 murine AT2 marker genes (Franzén et al., 2019) was significantly lower in C1 compared to C0 (Figure 1I; table S1).

It was recently shown that primary human LUAD contains cells that express multiple lineage-specific signatures (Laughney et al., 2020). Therefore, we looked for “lineage infidelity” in our early-stage GEMM data. We found that C1 had a lower expression of the AT2 markers Sftpc, Lyz2, and Etv5, consistent with a loss of AT2 identity. Strikingly, alveolar type 1 (AT1) markers Aqp5 and Pdpn, and club cell markers Scgb1a1 and Scgb3a2 were upregulated, indicating a transcriptional priming for other lung epithelial cell types. Furthermore, Ly6a (SCA1), a marker of lung stem cells in mice (Kim et al., 2005) and tumor propagating cells in the KP lung cancer model (Curtis et al., 2010) was also upregulated in some of the C1 cells (Figure S1H).

Finally, we performed Gene Ontology (GO) analysis on differentially expressed genes in C0 and C1 to identify pathways altered in AT2 cells after KRAS activation. In total we found 8 common, 73 C0-specific and 160 C1-specific enriched GO pathways (Figure S1I; table S2). Unique terms in C1 included “NIK/NF-kappaB signaling”, consistent with our finding (Figure 1G), and terms that indicate upregulated ribosome biogenesis and translation. Unique terms in C0 included cholesterol, alcohol, and lipid metabolism pathways, suggesting that these processes have an essential role in AT2 biology.

Inducible organoids rapidly recapitulate in vivo tumor progression and form tumors upon transplantation

To better understand transcriptional programs that follow KRASG12D activation, we developed an in vitro organoid system that allowed us to rapidly model changes in primary lung AT2 cells shortly after induction of oncogenic KRAS. We hypothesized that KrasG12D activation alone mimics an early tumor stage phenotype, while the additional loss of the tumor suppressor Tp53 models a more advanced stage, as is the case in GEMMs (Jackson et al., 2001, 2005). We generated organoids by dissecting lungs of adult KY, KrasLSL-G12D/+; p53fl/fl; Rosa26LSL-YFP (KPY), and Rosa26LSL-YFP (Y) control mice and used FACS to isolate AT2 cells (CD45−/CD31−/EPCAM+/SCA1−) (Kim et al., 2005; Lee et al., 2014) (Figure 2A). Cells were infected with Ad5-CMV-Cre (CRE) virus in vitro and cultured with stromal cells in our 3D organoid air-liquid interface (ALI) co-culturing system described previously (Lee et al., 2014, 2017). Upon Cre-expression, almost all organoids were YFP positive, suggesting a high Cre-induction efficiency (Figure 2B).

Figure 2. Inducible organoids rapidly recapitulate in vivo tumor progression and form tumors upon transplantation. See also Figure S2.


(A) Experimental strategy to grow air-liquid interface (ALI) organoid cultures in growth factor reduced (GFR) Matrigel.

(B) Representative whole-well brightfield (BF) and YFP-channel images of organoid cultures. Images were stitched together to show whole wells.

(C) Representative H+E stained organoid slides. Arrows: pleomorphic cells. Arrowheads: giant, multinucleated cells. Scale bar = 25 μm.

(D)(E) Quantification of KI67+ cells per organoid on (D) day 7 and (E) day 14 of organoid culture based on IF staining. Each dot represents one organoid.

(F, G, H) H+E staining of mouse lungs that were transplanted with organoid-derived cells. Scale bar lower magnification = 100 μm. Scale bar higher magnification = 25 μm.

P-values were determined using the Mann-Whitney rank test. n.s.=p≥0.05, *=p<0.05, **=p<0.005, ***=p<0.0005.

Histological analysis revealed that our tumor organoid model recapitulated in vivo tumor progression. Hematoxylin and eosin (H+E) stained sections of organoids demonstrated that Y-CRE control organoids maintained normal nuclei, whereas the nuclei of KY-CRE and KPY-CRE cells became enlarged and abnormal with giant multinucleated cancer cells in the KPY organoids (Figure 2C). This observation is reminiscent of documented in vivo tumor cell phenotypes in the KrasLSL-G12D/+ and KrasLSL-G12D/+; p53fl/fl (KP) mouse models (Jackson et al., 2001, 2005).

Next, we interrogated the effect of KRASG12D on proliferation. On day 7 of organoid culture there was no significant difference in the percentage of KI67+ cells per organoid between Y-CRE control and KY-CRE, while there was a 1.3-fold and 1.6-fold increase in the KPY-CRE organoids compared to Y-CRE control and KY-CRE, respectively (Figure 2D and S2A). On day 14 most of the Y-CRE control organoids stained negative for KI67, while both KY-CRE and KPY-CRE organoids still contained cells that stained positive for KI67 (Figure 2E and S2B). Thus, organoids from all three genotypes contained a high number of proliferating cells on day 7, but while most of the cells in the control organoids had stopped proliferating by day 14, cells in KY and KPY organoids continued to proliferate.
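
The per-organoid comparison reported here (a fold change in the percentage of KI67+ cells, tested with a Mann-Whitney rank test) can be illustrated with a small standard-library sketch. It is an approximation under stated assumptions: the p-value uses the normal approximation without tie or continuity correction, the function name is hypothetical, and the percentages are invented toy values rather than measurements from the study.

```python
import math
from statistics import mean

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Ties receive average ranks; the variance is not tie-corrected and no
    continuity correction is applied, so the p-value is approximate.
    Returns (U statistic for x, two-sided p-value).
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mu) / sigma
    return u_x, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical percentages of KI67+ cells per organoid (toy values).
control = [10, 12, 9, 11, 13, 8]
kpy = [15, 17, 14, 18, 16, 19]

fold_change = mean(kpy) / mean(control)
u_stat, p_value = mann_whitney_u(control, kpy)
```

For real data a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred, since it handles ties and small samples more carefully.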

To test if the KRASG12D expressing organoids form tumors in vivo, we performed orthotopic transplantation assays. We transplanted single-cell suspensions from Y-CRE control, KY-CRE, and KPY-CRE organoids into the lungs of bleomycin injured mice (n=4, n=6, n=4, respectively). After 4 weeks we evaluated tumor formation by histology. The lungs of Y-CRE control transplanted mice did not show any signs of aberrant epithelial cell growth or tumor formation (Figure 2F). In contrast, in the KY-CRE and KPY-CRE transplanted lungs we found tumors that contained cells with pleomorphic features, and giant cancer cells in the KPY-CRE transplanted lungs, comparable to observations in the organoid cultures (Figure 2G and 2H). Immunofluorescence (IF) staining for YFP confirmed that these tumor lesions contained the transplanted cells (Figure S2C and S2D). Hence, cells derived from our in vitro induced tumor organoids formed tumors within 4 weeks, dramatically reducing the time required to model lung cancer in vivo compared to traditional GEMMs.

KRASG12D activated cells in organoids lose AT2 differentiation markers and express developmental lung markers

To further investigate transcriptional changes following KRASG12D activation, we performed RNA-Seq on cells from our organoid cultures. KY- and KPY-derived AT2 cells received either Ad5-CMV-Empty virus (Emp, control), no virus (no virus, control), or CRE (Figure 3A). Because we sought to reveal transcriptional changes that follow KRASG12D activation and not proliferation, we analyzed the organoids on day 7 of organoid culture, when proliferation was observed in all organoid types. After 7 days in culture, single cell suspensions were enriched for epithelial cells by FACS for EPCAM+ cells (Figure S3A). 87% ± 7% and 95% ± 2% of the EPCAM+ cells of the KY-CRE and KPY-CRE samples, respectively, were YFP+, further confirming the high efficiency of the in vitro Cre induction. Next, we performed RNA-Seq on the EPCAM+ cells. Sample-sample correlation analysis revealed that all control samples were highly correlated, while the KY-CRE and KPY-CRE samples were highly correlated with each other and transcriptionally distinct from the controls (Figure S3B). To perform DE analysis, we compared the CRE samples to their respective Emp controls (KY-Dif and KPY-Dif, table S3). To determine genes that were altered by KRASG12D expression, we compared KY-Dif to KPY-Dif and found 1206 genes that were upregulated in both and 1464 genes that were downregulated in both (Figure S3C; table S3).
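
The comparison of the two DE result sets (KY-Dif and KPY-Dif) amounts to intersecting genes that change significantly in the same direction in both. A minimal sketch follows; the function name, the `(log2 fold change, adjusted p-value)` layout, the 0.05 cutoff, and the toy values are all assumptions for illustration and are not taken from the paper.

```python
def shared_de_genes(dif_a, dif_b, padj_cutoff=0.05):
    """Return (shared_up, shared_down) gene sets for two DE comparisons.

    Each input maps gene -> (log2_fold_change, adjusted_p_value).
    A gene is shared only if it is significant and changes in the same
    direction in both comparisons.
    """
    def significant(dif, sign):
        return {
            gene for gene, (lfc, padj) in dif.items()
            if padj < padj_cutoff and lfc * sign > 0
        }

    up = significant(dif_a, +1) & significant(dif_b, +1)
    down = significant(dif_a, -1) & significant(dif_b, -1)
    return up, down

# Toy DE tables with placeholder gene names and made-up statistics.
toy_ky_dif = {
    "GeneA": (2.1, 0.001),
    "GeneB": (-1.8, 0.010),
    "GeneC": (1.2, 0.200),   # not significant in this comparison
    "GeneD": (-0.9, 0.030),
}
toy_kpy_dif = {
    "GeneA": (2.5, 0.002),
    "GeneB": (-2.2, 0.001),
    "GeneC": (1.5, 0.010),
    "GeneD": (0.7, 0.040),   # opposite direction, so not shared
}
shared_up, shared_down = shared_de_genes(toy_ky_dif, toy_kpy_dif)
```

Requiring agreement in direction is what separates genes that respond to KRASG12D in both genotypes from genes that are merely significant in both tables.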

Figure 3. KRASG12D activated cells in organoids lose AT2 differentiation markers and express developmental lung markers. See also Figure S3.


(A) Experimental strategy to grow air-liquid interface (ALI) organoid cultures to perform RNA-Seq.

(B) Venn diagram showing the overlap of the top 100 differentially expressed genes in KY-CRE and KPY-CRE compared to their respective -Emp controls.

(C) Log2 fold change expression of selected genes compared to their control from RNA-Seq results.

(D) Representative pictures of IF staining on day 7 of organoid culture. Scale bar = 100 μm.

(E) Quantification of SPC+ cells per organoid on day 7 of organoid culture. Each dot represents one organoid.

(F) Representative pictures of IF staining on day 7 of organoid culture. Scale bar = 25 μm.

P-values were determined using the Mann-Whitney rank test. n.s.=p≥0.05, ***=p<0.0005

Because we saw a downregulation of AT2 differentiation genes in our GEMM data, we investigated the expression of known AT2 markers and lung developmental genes. When we compared the top 100 up- and downregulated genes in our RNA-Seq data, we found that Cd74 and Lyz2, two AT2 marker genes, were amongst the top shared downregulated genes (Figure 3B, 3C). Conversely, the developmental genes Hmga2 and Sox9 were both upregulated. Furthermore, we found increased expression of Ly6a (SCA1), consistent with our findings in vivo (Figure S1H, 3B, 3C). Moreover, we found that other known AT2 markers, Sftpc (SPC), Sftpd, and Nkx2–1, were significantly downregulated in organoids from both genotypes (Figure 3C).

Next, we investigated if these changes also occurred at the protein level. IF staining for SPC showed that the percentage of SPC+ cells per organoid decreased 6.7-fold in KY-CRE and 20-fold in KPY-CRE compared to Y-CRE control organoids on day 7 (Figure 3D, 3E). On day 14 there was a 1.1-fold decrease in KY-CRE and a 1.6-fold decrease in KPY-CRE compared to Y-CRE control organoids (Figure S3E, S3F). Furthermore, staining for the lung epithelial marker NKX2–1 and the developmental marker HMGA2 was negatively correlated; individual cells that gained HMGA2 expression had reduced levels of NKX2–1 (Figure 3F). Thus, we demonstrated that the transcriptional downregulation of AT2 markers and the upregulation of developmental markers correlated with altered expression of the respective proteins.

KRASG12D expressing organoid cells are transcriptionally distinct and transition to a developmental-like state

To further characterize our KY-CRE organoids we performed scRNA-Seq. As before, we characterized day 7 EPCAM+ cells from KY-CRE and KY-Emp organoids (Figure 4A, S4A). After filtering and preprocessing the data, we identified three clusters C0org, C1org, and C2org (Figure 4B, S4B, S4C). C1org was composed mostly of KY-Emp cells, representing the control cluster, while C0org and C2org mostly contained KY-CRE cells (Figure 4B, 4C, 4D). Correlation analysis revealed that all three clusters were distinct and that C0org and C1org were negatively correlated (Figure S4D). As with our GEMM data, we checked the expression of previously published gene signatures upregulated in NSCLC. As expected, the KRAS activation signature was upregulated in C0org and C2org compared to control cluster C1org (Figure 4E; table S1). The NF-kappaB activation signature was lower in C2org and higher in C0org compared to C1org, indicating that only one of the Cre clusters has upregulated NF-kappaB signaling (Figure 4F; table S1). Interestingly, the proliferation signature was only elevated in C2org and not in C0org, indicating that only one of the Cre clusters has a higher proliferation signature than the control, despite high Kras activation signatures in both clusters (Figure 4G; table S1).

Figure 4. KRASG12D expressing organoid cells are transcriptionally distinct and transition to a developmental-like state. See also Figure S4.


(A) Experimental strategy to grow air-liquid interface (ALI) organoid cultures followed by scRNA-Seq.

(B)(C) Clustering of transcriptomes using UMAP. Cells are colored based on (B) Louvain clusters or (C) Batch ID.

(D) Batch contributions to each Louvain cluster with number of cells indicated.

(E)(F)(G)(H) Z-scores of indicated signatures in each Louvain cluster. Dashed line marks median of reference sample.

(I)(J) Log2 expression of indicated genes. Dashed line marks median expression of the reference sample.

(K) Z-score of indicated signature in each Louvain cluster. Dashed line marks median of reference sample.

(L) RNA velocity analysis of KRASG12D organoid scRNA-Seq dataset. Louvain clusters are shown on the left. Sox9 expression is visualized on the right.

P-values were determined using a Mann-Whitney rank test. *** = p-value < 0.001, ** = p-value < 0.01.

Next, we performed DE analysis followed by identification of TF/TFCs (Figure S4E, S4F; table S1 and S4). Similar to our GEMM data, control C1org had elevated expression of Etv5, providing additional evidence for loss of AT2 transcriptional identity. One TF highly expressed in both Cre clusters compared to the control was Foxq1, and C2org had high expression of Id1, two TFs we had also detected in our GEMM. Interestingly, C0org had high expression of the lung development TF Sox9, confirming the observations in our RNA-Seq analysis (Figure 3C). In the same cluster, Smad7 and Trp53, indicative of Tgfb and p53 signaling, respectively, were also upregulated.

In agreement with our RNA-Seq and IF results, we observed a reduced AT2 signature in the two KY-CRE clusters C0org and C2org, similar to our GEMM data (Figure 4H, 1I). Consistent with that, the AT2 markers Lyz2 and Sftpc, and the lung identity TF Nkx2–1 had reduced expression (Figure 4I). In contrast, the lung development genes Hmga2 and Sox9 were upregulated in both KY-CRE clusters (Figure 4I, 4J) (Kim et al., 2005; Liu et al., 2019; Salwig et al., 2019; Singh et al., 2014). Furthermore, we found that a Sox9 target gene signature was upregulated in C0org, suggesting that Sox9 is both highly expressed and active in this cluster (Figure 4K; table S1). Next, we tested if Sox9 is also upregulated in our YFP+ cluster in our GEMM. Strikingly, both Sox9 and the Sox9 target activation signature were significantly upregulated in the YFP+ C1 cluster compared to the YFP− C0 cluster (Figure S4G, S4H). Notably, the changes in the GEMM were much more subtle and the expression levels lower compared to the organoid data.

We wondered if the two KY-CRE clusters represent two different stages in cancer cell progression and if there is a transition from one cluster to the other. To address this question we analyzed the organoid and GEMM scRNA-Seq datasets using RNA velocity, a computational pipeline that infers expression dynamics and directionality based on RNA splicing (La Manno et al., 2018). In the organoid data, RNA velocity indicated that KRASG12D expressing AT2 cells transition from Sox9LOW to Sox9HIGH cells (Figure 4L). In contrast, while the C1 GEMM cluster shows a clear direction of transition, it is not solely directed towards Sox9+ cells (Figure S4I). This observed difference might be due to the significantly lower expression levels of Sox9 in the GEMM.

Next, we tested if the cells expressed differentiation markers of other cell types as observed in our GEMM data (Figure S1H). As expected, the two Cre clusters C2org and C0org had lower expression of the AT2 markers Sftpc, Lyz2, and Etv5, consistent with a loss of AT2 identity (Figure S4J). In contrast to our GEMM data, the AT1 marker Aqp5 had higher expression in the Cre clusters, while Pdpn expression was elevated in the control cluster. Furthermore, some cells in the Cre clusters had high expression of the ciliated cell markers Cd24a and Foxj1, and the progenitor marker Ly6a (SCA1). Both SCA1 and CD24 mark tumor propagating cells in the KP mouse model (Lau et al., 2014). The club cell markers Scgb1a1 and Scgb3a2 were upregulated in some of the Cre expressing cells, similar to our observations in the GEMM.

Lastly, GO enrichment analysis was performed to identify unique pathways for each of the KY organoid clusters (Figure S4K; table S4). Pathways enriched in C0org included “Regulation of I-kappaB kinase/NF-kappaB signaling”, consistent with the increased NF-kappaB signature (Figure 4F), and “ERBB signaling”, demonstrated to facilitate KRASG12D lung tumorigenesis (Kruspig et al., 2018). C2org was enriched for pathways related to translation, mRNA processing, and G1/S transition, potentially connected to the increased proliferation signature identified in this cluster (Figure 4G). Control C1org, much like YFP− AT2 cells in our GEMM scRNA-Seq dataset (table S2), was enriched for cholesterol, alcohol, and lipid metabolism pathways (table S4).

Overall, we found many similarities between our GEMM and in vitro induced tumor organoid system. Most notably, we found that AT2 lineage genes are downregulated and developmental and progenitor genes are upregulated in both models, providing evidence that loss of differentiation occurs during early-stage LUAD.

Human iAT2s downregulate differentiation and maturation markers and upregulate progenitor markers upon KRASG12D expression

In order to test if the loss of AT2 differentiation markers early after KRASG12D induction can also be observed in human cells, we engineered an iPSC line to allow doxycycline (dox) regulated activation of KRASG12D in iPSC-derived AT2 cells (iAT2s). Using the iPSC line BU3 NGST (Jacob et al., 2017), which includes GFP and tdTomato reporters targeted to the endogenous NKX2–1 and SFTPC loci, respectively, we integrated the KRASG12D cassette together with a dox-inducible promoter into the “safe harbor” AAVS1 locus (Figure 5A) (Tiyaboonchai et al., 2014). Next, we differentiated the iPSCs into NKX2–1+ lung epithelial progenitors, sorted for NKX2–1GFP+ cells by FACS, and generated distal lung alveolospheres using our lung-directed differentiation protocol (Figure 5B) (Jacob et al., 2019). To test the dox inducible KRASG12D construct, we treated the alveolospheres with control vehicle (DMSO) or dox and performed deep proteomic and phosphoproteomic analysis (n=4 replicates per condition; Figure S5A, S5B; table S5). As expected, we observed an upregulation of KRAS protein in the dox treated cells (Figure 5C), and increased phosphorylation of KRAS targets such as MAPK1, RPS6KA1, and MAPK3 (Figure 5C). Gene set enrichment analysis (GSEA) revealed RAS signaling as the top enriched pathway in the dox treated iAT2s (Figure 5D). Therefore, our proteomics and phosphoproteomics analyses confirmed that iAT2 KRASG12D cells upregulated KRAS and components of the RAS/MAPK signaling pathway upon dox treatment, indicating successful dox regulated functional activation of KRAS in the human iAT2 in vitro model system.

Figure 5. Human iAT2s downregulate differentiation and maturation markers and upregulate progenitor markers upon KRASG12D expression. See also Figure S5.


(A) Schematic of AAVS1 locus with integrated dox inducible KRASG12D.

(B) Experimental strategy and timeline to grow and analyze KRASG12D inducible iAT2. DOX = doxycycline (1 μg/ml); p1, p2, p3 = passage 1, 2, 3.

(C) Volcano plots indicating differential protein (left) and phosphoprotein (right) expression between dox induced and control iAT2s.

(D) Top 10 upregulated pathways in dox induced compared to control iAT2s based on phosphoproteomics analysis.

(E) FACS analysis of iAT2s over three passages following the initiation of dox vs. control vehicle (DMSO) treatment. Mean fluorescence intensity (MFI) of tdTomato is indicated.

(F) Log expression of indicated genes. P-values were determined using the MAST single-cell test. *p<0.05.

(G) Log expression of indicated gene signatures. P-values were determined using a Welch Two Sample t-test. *p<0.05.

(H) Log expression of indicated genes. P-values were determined using the MAST single-cell test. *p<0.05.

To assess the downstream consequences of this signaling in iAT2s, we sorted pure NKX2–1GFP+ SFTPCtdTomato double positive cells and treated them with dox or DMSO (Figure 5B). After 2 weeks of treatment, flow analysis revealed that while the majority of cells maintained NKX2–1GFP expression in both conditions, there was a reduction of SFTPCtdTomato expression frequency and intensity in the dox condition, which was sustained through multiple passages (Figure 5E).

To better understand the loss of SFTPC, we performed scRNA-Seq (Figure 5B). Using the 10X Chromium platform we profiled the transcriptomes of 775 DMSO and 1322 dox treated cells and performed DE analysis (table S5). Unbiased analysis of all cells revealed 3 cell clusters with control iAT2s grouped as a single cluster (Figure S5C, S5D). DE analysis showed significant upregulation of KRAS in both dox-treated clusters, one of which also exhibited significant upregulation of proliferation markers (e.g. MKI67, TOP2A, and CDK1) (Figure 5F, S5E). In contrast, multiple AT2 genes were significantly upregulated in the control cluster (e.g. LPCAT1, SFTPB, SFTPC, CRLF1, CTSH, SLC34A2, NAPSA, and PGC) (Figure 5F, S5E). Consistent with this observation, previously published iAT2 differentiation (SFTPB, SFTPC, SFTPD, CLDN18, LAMP3, SLC34A2, IL8, NAPSA) and maturation (SFTPA1, SFTPA2, PGC, CXCL5, SLPI) gene signatures (Hurley et al., 2020), and 20 AT2 markers shared between mouse and human from the Panglao database (table S1) were significantly downregulated in dox-treated iAT2s (Figure 5G, S5F), as was the TF ETV5, which we had also identified in our GEMM and murine organoid data (Figure S5G). Moreover, the TFs FOXQ1 and ID1 were upregulated, together with the developmental and progenitor genes SOX9 and ETV4, which is also consistent with our murine data (Figure 5H, S5G). An additional notable upregulated transcript in dox-treated iAT2s was TM4SF1, recently reported as an alveolar epithelial progenitor cell marker, enriched in Wnt responsive cells during regeneration in vivo (Zacharias et al., 2018) (Figure 5H). In keeping with an increased Wnt response, the Wnt target gene LEF1 (McCauley et al., 2017; Zacharias et al., 2018) was upregulated in dox-exposed cells (Figure S5G). 
As indicated by our FACS results, NKX2–1 was still expressed by our KRASG12D expressing cells, but slightly downregulated, consistent with our RNA-Seq data and IF staining in the murine organoids (Figure 3C, 3F, S5G).
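
The signature comparisons between dox-treated and control iAT2s are reported with a Welch two-sample t-test (Figure 5G legend). The sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom with the standard library only. The function name and the toy signature scores are hypothetical, and the p-value lookup is left to a statistics library.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Uses sample variances (n - 1 denominator) and does not assume that
    the two groups have equal variance.
    """
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Toy per-cell signature scores (made-up values, not the study's data).
dmso_scores = [2.0, 2.2, 1.8, 2.1, 1.9]
dox_scores = [1.0, 1.3, 0.7, 1.1, 0.9]

t_stat, dof = welch_t(dmso_scores, dox_scores)
```

A positive statistic here means the control group scores higher than the dox group, which is the direction expected for a differentiation signature that is lost after KRASG12D induction.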

Taken together, our human iAT2 results indicated that KRASG12D expression leads to downregulation of iAT2 differentiation and maturation markers and upregulation of progenitor and developmental markers, corroborating the results from our GEMM and murine organoid models.

Differentiation and maturation markers are downregulated in AT2 cells from human early stage LUAD

To assess if the loss of AT2 identity observed in our GEMM, murine organoid, and human iAT2 models also occurs in lung cancer patients, we performed scRNA-Seq of LUAD specimens with activating KRAS mutations and associated distal normal lung tissues (>2 cm from the tumor) from two stage IA LUAD patients (Figure 6A). Unsupervised clustering of non-immune cells identified epithelial (EPCAM+), fibroblast (COL1A1+), and endothelial (PECAM1+) cell clusters (Figure 6B, 6E, S6A). The epithelial cells were further divided into AT1 (PDPN+), club (SCGB1A1+), ciliated (FOXJ1+), and two distinct AT2 clusters, one comprised of AT2 cells from normal lung tissue, and the second one from LUAD (Figures 6C, 6D, 6E). AT2 cells from normal lung were characterized by high SFTPB and SFTPD expression, whereas AT2 cells from stage IA LUAD had decreased SFTPD expression (Figure 6E). Interestingly, AT2 cells were the only epithelial cell type that formed distinct clusters in LUAD and associated normal lung tissues, while other cell types aggregated together regardless of their origin (Figure 6B, 6C), consistent with our observations in the KRASG12D GEMM (Figures 1B, 1C). Next, we checked the expression of 20 AT2 markers shared between mouse and human from the Panglao database (table S1) in AT2 cells from LUAD patients. All 20 markers were highly expressed in normal lung and LUAD AT2 cells, but not in the other cell types, thus confirming that AT2 cell cluster annotation was appropriate in both normal lung and LUAD (Figure 6F, S6B, table S6). However, AT2 cells from stage IA LUAD expressed reduced levels of these markers compared to AT2 cells from normal lung, which was consistent with our findings in GEMM, murine organoid, and human iAT2 model systems (Figure 6F, S6B). To our knowledge, this is the first documentation of loss of AT2 identity in human early stage LUAD patient samples.

Figure 6. Differentiation and maturation markers are downregulated in AT2 cells from human early stage LUAD. See also Figure S6.

(A) Experimental strategy to obtain cells from human early stage IA LUAD for scRNA-Seq.

(B)+(C) Louvain clustering of transcriptomes of non-immune LUAD cell types and matching normal lung tissue. Cells are colored based on (B) Louvain clusters or (C) Sample ID.

(D) Batch contributions to each Louvain cluster shown in 6B with number of cells indicated.

(E) Violin plots showing gene expression values of selected genes in annotated clusters shown in 6B.

(F) Z-score of gene signature comprised of AT2 signature genes shared between mouse and human from the Panglao database. Dashed line marks y=0.

(G) Transcriptional comparison of KRAS LUAD models. Correlation heatmap of individual cells of the organoid scRNA-Seq data (x-axis) and z-normalized gene signatures (y-axis). Cells are ordered based on correlation distance calculation. Louvain clusters are annotated.

Comparison of GEMM, murine and human organoid models, and patient early-stage LUAD datasets

Comparison of the transcriptional profiles in the model systems showed that our murine and human organoid systems recapitulated transcriptional changes in the GEMM and in early-stage lung cancer patients, revealing a shared downregulation of alveolar differentiation markers. To further compare all four scRNA-Seq datasets to each other, we calculated z-scores for each cell in our murine KrasG12D organoid dataset using gene signatures derived from our DE analysis and previously published signatures. As expected, control AT2 cells from the GEMM correlated with control murine organoid AT2 cells; organoid control AT2 cluster C1 was most similar to the AT2 YFP cluster signature from the GEMM model (Figure 6G). Furthermore, murine organoid cells with oncogenic KRAS and GEMM cells with oncogenic KRAS were transcriptionally similar; murine organoid C0 and C2 correlated with the AT2 YFP+ GEMM signatures. Both the human KRAS iAT2 and the KRAS mutant patient datasets were similar to the murine KRASG12D organoid cells, with both iAT2 and patient cells most closely resembling C0. Moreover, we found that our murine organoids correlated with the gene expression signature of lung cancer progression (“LUAD progression”) from a report which used the KrasG12D GEMM (Neidler et al., 2019). The organoid datasets also correlated with the HALLMARK_WNT signature (Broad MSigDB), demonstrating how our organoids recapitulate the GEMM, as the Wnt pathway was shown to play an important role in lung cancer progression (Tammela et al., 2017).
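
The per-cell signature scoring described above can be sketched roughly as follows. This is a minimal illustration in plain Python, assuming that a signature score is the mean of per-gene z-scores across cells; the exact scoring procedure behind Figure 6G is not spelled out in this section, and the function names, gene choices, and expression values below are hypothetical toy inputs rather than study data.

```python
from statistics import mean, pstdev

def zscore_genes(expr):
    """Z-normalize each gene across cells. expr maps gene -> list of values (one per cell)."""
    out = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        # A gene with zero variance carries no information; score it as 0 everywhere.
        out[gene] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return out

def signature_score(expr, signature):
    """Per-cell mean z-score over the signature genes that are present in the data."""
    z = zscore_genes(expr)
    genes = [g for g in signature if g in z]
    if not genes:
        raise ValueError("no signature genes found in expression data")
    n_cells = len(z[genes[0]])
    return [mean(z[g][i] for g in genes) for i in range(n_cells)]

# Hypothetical toy matrix: cells 0-1 are "control-like", cells 2-3 are "KRAS-like".
toy_expr = {
    "Sftpc": [9.0, 8.0, 2.0, 1.0],
    "Lyz2": [7.0, 8.0, 3.0, 2.0],
    "Sox9": [0.0, 1.0, 5.0, 6.0],
}
at2_scores = signature_score(toy_expr, ["Sftpc", "Lyz2"])
```

In this toy example the control-like cells receive positive AT2 scores and the KRAS-like cells negative ones. Correlating such score vectors across signatures from different datasets is one way to arrive at a comparison of the kind shown in the heatmap.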

Taken together, our resource provides omics analyses from three models and patient-derived KRAS-driven LUAD at its earliest stages. Our data suggest that reduction of the mature AT2 transcription program is an important early step in KRAS-driven LUAD initiation. Furthermore, we demonstrated in GEMM and human stage IA LUAD patients that only AT2 cells transition to a transcriptionally distinct state during the early stage of KRAS tumorigenesis. Additionally, the in vitro induced human and murine organoid systems, which recapitulate core components of early stage LUAD progression, provide rapid and easily perturbed models for investigation of lung cancer biology.

Discussion

In our studies, we show that developmental gene signatures are present in early-stage, non-metastasizing LUAD, indicating that alveolar cells lose differentiation markers early after activation of oncogenic KRAS. To our knowledge, this is the first time it has been shown that loss of differentiation occurs in early-stage LUAD. It is a long-held notion that tumor cells hijack developmental programs. However, this process has been thought to occur in late-stage, metastasizing tumors (Kulesa et al., 2013; Nieto, 2013; Thiery, 2002; Yang and Weinberg, 2008). In humans, SOX9 protein levels are correlated with a higher NSCLC tumor stage and worse survival (Jiang et al., 2010; Zhou et al., 2012). In mouse models, primary tumors that have metastasized contain cells that have lost NKX2–1 and express HMGA2 (Winslow et al., 2011). Sox9, Nkx2–1, and Hmga2 are all genes with important functions during embryonic lung development (Alanis et al., 2014; Maeda et al., 2007; Singh et al., 2014). SOX9 has been shown to work together with KRAS in lung development to maintain a balance between branching morphogenesis and alveolar differentiation (Chang et al., 2013). While NKX2–1 is still present in adult lung epithelial cells, SOX9 and HMGA2 are not found in healthy adult lung epithelium (Nikolić et al., 2017; Pfannkuche et al., 2009). Taken together, our organoid systems can be used to identify transcriptional states in cells bearing oncogenic KRAS that distinguish them from their normal adult epithelial counterparts, shedding light on new ways to intervene in lung cancer progression.

Our findings present a murine organoid system that can be used as a tool to study tumor initiation and progression in a controlled environment. We directly compared our KY AT2-derived organoids to AT2 cells with activated KRASG12D in vivo at an early-stage time point. We observed corresponding transcriptional changes in day 7 organoids and in vivo cells 7 weeks after induction. Therefore, we hypothesize that the tumor organoids recapitulate LUAD progression in an accelerated manner. Furthermore, our transplantation studies showed that the KRAS tumor organoids can be orthotopically transplanted. Therefore, our organoid system can be used for in vitro manipulation and subsequent transplantation, to facilitate the study of potential therapeutic targets on lung cancer development and progression. This creates an exciting opportunity to model lung cancer tumorigenesis on an accelerated time scale while maintaining the core transcriptional signatures that appear during tumor progression, in a manner that is compatible with genetic or chemical perturbations prior to transplantation.

Our murine organoid data show remarkable similarities to our GEMM and human datasets. However, we also found differences. Some of the cells from the organoids have high expression levels of Sox9 and Hmga2, and stain positive for HMGA2, while the transcriptional upregulation of Sox9 and its targets in our GEMM data is rather modest. Furthermore, we see a strong downregulation of the AT2 signature in murine and human organoids, with almost complete loss of SPC expression in the murine organoids, while the downregulation of the signature is more subtle in our GEMM data. One explanation is that the organoids are in a state of unrestrained proliferation. It is therefore conceivable that our organoids progress quickly, while cells in the GEMM receive inhibitory cues from the microenvironment or are cleared by the immune system. Indeed, it is difficult to compare the timeline of the organoids to the timeline of tumor progression in vivo. Nevertheless, because of their defined and easily manipulated culture conditions, we think that the organoids are an advantageous system to study the direct effect that KRASG12D expression has on AT2 cells.

Together, our work provides murine and human organoid systems to study LUAD progression rapidly in vitro. We analyzed our murine organoids, human iAT2 organoids, KRASG12D GEMM, and stage IA patient data and provided these datasets to the research community. Our comparison of the single cell datasets revealed a common loss of AT2 identity as an early event following KRAS pathway activation in all four contexts. These comparisons also revealed the utility of our murine tumor organoid system in modeling human lung cancer driven by KRAS mutagenesis in its earliest stage. Bulk RNA-Seq, proteomics, and phosphoproteomics validate our findings in the single cell datasets and are an additional resource for data mining. Our data may be a useful component of cancer atlas projects and may aid in screening candidate drug targets to prevent progression of early stage LUAD in KRAS mutant patients, followed by proof of principle testing. Additionally, the organoid tools described could have utility in the cancer modeling field and drug screening.

Limitations of study

While the observed loss of alveolar identity markers was validated at the protein level in our organoid cultures, we only presented evidence for downregulation of these markers in our GEMM and stage IA patient data at the transcriptional level. In future work, we will examine GEMM and patient samples by immunohistochemistry to confirm the changes in AT2 cell marker expression and altered expression of TFs and their targets. Further studies are also required to determine if decreased NKX2–1 expression is an important early consequence of expression of oncogenic KRAS in human cells. In a related manner, there are multiple upstream signaling pathways connected to genes in our analysis that could cause the loss of AT2 cell differentiation phenotype, including Sox9, Wnt, and Nkx2–1. Determining the role of AT2 lineage identity and other observed transcriptional changes on LUAD progression will be important. In-depth analysis of lineage plasticity and assessment of the transcriptomic, proteomic, and functional heterogeneity between cells expressing oncogenic KRAS in early-stage LUAD will be another interesting topic of future work. Finally, while our organoid models of oncogenic KRAS activation provide rapid ways to identify possible therapeutic avenues for early-stage LUAD, we have not yet validated a new therapeutic lead generated from our data.

STAR Methods

RESOURCE AVAILABILITY

Lead Contact

Further information and requests for resources and reagents should be directed to the Lead Contact, Carla F. Kim (carla.kim@childrens.harvard.edu).

Materials Availability

Pluripotent stem cell lines generated in this study are available from the CReM Biobank at Boston University and Boston Medical Center and can be found at http://www.bumc.bu.edu/stemcells.

Data and code availability

Raw and processed single-cell and bulk RNA-seq data were deposited to the NCBI Gene Expression Omnibus (GEO) and Sequencing Read Archive (SRA) under the following accession codes:

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the Proteomics Identifications (PRIDE) partner repository (Deutsch et al., 2020; Perez-Riverol et al., 2019) with the dataset identifier PXD019240.

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Mouse cohorts

KrasLSL-G12D/WT (Jackson et al., 2001) and KrasLSL-G12D/WT;p53flox/flox (Jackson et al., 2005) mice were crossed to Rosa26LSL-eYFP mice to obtain KrasLSL-G12D/WT; Rosa26LSL-eYFP (KY) and KrasLSL-G12D/WT;p53flox/flox; Rosa26LSL-eYFP (KPY) mice. Rosa26LSL-eYFP (Y) control mice were littermates of the KY mice. Mice were maintained in virus-free conditions. All mouse experiments were approved by the BCH Animal Care and Use Committee, accredited by AAALAC, and were performed in accordance with relevant institutional and national guidelines and regulations.

Stage IA LUAD patient information

Samples of two patients with the diagnosis stage IA LUAD were analyzed in these studies. One patient was female, 74 years old, with a KRAS-G12F mutation identified as driver mutation. The other patient was female, 77 years old, with a KRAS-G12V mutation identified as driver mutation. All patients provided written informed consent. The studies were approved by the UCLA institutional review board.

METHOD DETAILS

Mouse studies

In vivo adenovirus infection

8-week-old mice were infected with 2.5 × 10^7 PFU adenovirus by intratracheal instillation as described previously (DuPage et al., 2009). A 1:1 ratio of male and female mice was used.

Lung preparation and FACS

Mice were anesthetized with avertin, perfused with 10 ml PBS, followed by intratracheal instillation of 2 ml dispase (Corning). Lungs were iced, minced and incubated in 0.0025% DNAse (Sigma Aldrich) and 100 mg/ml collagenase/dispase (Roche) in PBS for 45 min at 37°C, filtered through 100 μm and 40 μm cell strainers (Fisher Scientific), and centrifuged at 1000 rpm, 5 min at 4°C. Cells were resuspended in red blood cell lysis buffer (0.15 M NH4Cl, 10mM KHCO3, 0.1 mM EDTA) for 1.5 min, washed with advanced DMEM (Gibco), and resuspended in PBS/10% FBS (PF10) at 1 million/100 μl. Depending on the experiment, cells were incubated for 10 min on ice with DAPI as a viability dye and the following antibodies: anti-CD31 APC, anti-CD45 APC, anti-Ly-6A/E (SCA1) APC/Cy7 (all Thermo Fisher Scientific), anti-CD326 (EP-CAM) PE/Cy7 (Biolegend) (all 1:100). Single stain controls and fluorophore minus one (FMO) controls were included for each experiment. FACS was performed on a FACSAria II and analysis was done with FlowJo.

In vitro virus infection and organoid culture

Murine lung CD31− CD45− EPCAM+ SCA1− cells isolated by FACS as described in section “Lung preparation and FACS” were split into 2 or 3 equal aliquots, or not split, depending on the experiment, pelleted by pulse spin and resuspended in MTEC/Plus media (Zhang et al., 2017a) containing 6 × 10^7 PFU/ml of Ad5CMV-Cre, Ad5CMV-Empty, or no virus, at 100 μl per 100,000 cells. The cells were incubated for 1 h at 37°C, 5% CO2 in 1.5 ml tubes. Cells were then pelleted by pulse spin and resuspended in 1x phosphate-buffered saline (PBS). This step was repeated twice for a total of three washing steps. Cells were resuspended in Dulbecco’s Modified Eagle’s Medium/F12 (Invitrogen) supplemented with 10% FBS, penicillin/streptomycin, 1 mM HEPES, and insulin/transferrin/selenium (Corning) (3D media) at a concentration of 5,000 live cells (trypan blue negative) per 50 μl. As supporting cells, a mix of neonatal stromal cells was isolated as described elsewhere (Lee et al., 2014). The stromal cells were pelleted and resuspended in growth factor reduced (GFR) Matrigel at a concentration of 50,000 cells per 50 μl. Equal volumes of cells in 3D media and supporting cells in GFR Matrigel were mixed and 100 μl were pipetted into a Transwell (Corning). Plates were incubated for 20 min at 37°C, 5% CO2 until the Matrigel solidified. Finally, 500 μl of 3D media was added to the bottom of the well. 3D media was changed every other day.
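
The dosing and seeding arithmetic in this protocol can be checked with a short sketch. It assumes the viral titer reads as 6 × 10^7 PFU/ml (the exponent appears to have been flattened in extraction) and that each Transwell receives 50 μl of epithelial suspension plus 50 μl of stromal suspension; the function names are hypothetical helpers, not part of any published tool.

```python
def pfu_per_cell(titer_pfu_per_ml, volume_ul, n_cells):
    """Viral particles (PFU) available per cell during the 1 h infection."""
    volume_ml = volume_ul / 1000.0
    return titer_pfu_per_ml * volume_ml / n_cells

def transwell_contents(epi_per_50ul, stroma_per_50ul, well_volume_ul=100.0):
    """Cells per Transwell when equal volumes of the two suspensions are mixed."""
    half_volume_ul = well_volume_ul / 2.0
    return {
        "epithelial_cells": epi_per_50ul * half_volume_ul / 50.0,
        "stromal_cells": stroma_per_50ul * half_volume_ul / 50.0,
    }

# Values taken from the protocol text (titer exponent is an assumed reading).
dose = pfu_per_cell(titer_pfu_per_ml=6e7, volume_ul=100.0, n_cells=100_000)
well = transwell_contents(epi_per_50ul=5_000, stroma_per_50ul=50_000)
```

Under these assumptions the infection corresponds to roughly 60 PFU per cell, and each 100 μl Transwell holds about 5,000 epithelial cells and 50,000 stromal cells, a 1:10 ratio.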

Staining and IF of organoid cultures

To image whole wells, multiple overlapping images of live organoid cultures were taken and stitched together using AutoStitch software. To prepare organoid slides, organoid cultures were fixed with 10% neutral-buffered formalin overnight at room temperature. After rinsing with 70% ethanol, the Matrigel plug containing the organoid cultures was immobilized with Histogel (Thermo Scientific) for paraffin embedding. Paraffin blocks were cut into 5 μm sections and adhered to glass slides. For deparaffinization, slides were incubated in xylene and then rehydrated in 100%, 95%, and 70% ethanol successively. Slides were then stained with haematoxylin and eosin, or further processed for IF staining. For IF staining, antigen was retrieved by incubating the slides in citric acid buffer (pH 6) at 95°C for 20 min. After washing slides with PBS containing 0.2% Triton-X (PBS-T) and blocking with 10% normal donkey serum for 1 h at room temperature, slides were incubated with antibodies for Ki67 (eBioscience, 1:100), YFP (Abcam, 1:400), SPC (Abcam, 1:1,000), Nkx2–1 (Abcam, 1:250), and Hmga2 (GeneTex, 1:200) in a humidified chamber at 4°C overnight. Secondary antibodies were added following three washing steps with PBS-T and included donkey anti-rat Alexa 594, donkey anti-goat Alexa 488/647, donkey anti-rabbit Alexa 488/594, and donkey anti-mouse Alexa 647 (all Invitrogen, 1:200). Slides were mounted using Prolong Gold with DAPI (Invitrogen).

Preparing single cell suspensions of organoid cultures

At day 7 of organoid culture, 100 μl dispase (Fisher Scientific) was added to the transwells on top of the Matrigel and incubated for 1 h at 37°C, 5% CO2. After digestion of Matrigel, the wells were washed with PBS and the organoids were pipetted into 15 ml conical tubes. The tubes were filled with PBS to dilute the remaining Matrigel and dispase. After pelleting the organoids at 300 g for 5 min, the organoids were resuspended in 37°C warm Trypsin EDTA (0.25%, Invitrogen) and incubated for 7–10 min at room temperature to obtain a single cell suspension. Trypsin was quenched by adding PBS + 10% FBS (PF10).

Transplantation assays of organoids

To ensure engraftment, 8–10-week-old Athymic Nude mice were injured by injecting 1.5 U/kg bleomycin intratracheally one day before transplantation. For transplantation assays, single cell suspensions were obtained from day 14–21 of passage 0 organoid cultures as described in section “Preparing single cell suspensions of organoid cultures”. To ensure transplantation of equal numbers of Cre-activated cells across samples, YFP+ cells were counted under the fluorescence microscope and 33,000–130,000 YFP+ cells resuspended in 45 μl PBS were administered intratracheally into the lungs of the injured Athymic Nude mice. For histology evaluation, mice were sacrificed after 4 weeks and lungs were fixed by injecting 10% neutral-buffered formalin into the lungs through the trachea.

FACS to prepare organoid cultures for RNA-Seq

Single cell suspensions were obtained from day 7 organoid cultures as described in section “Preparing single cell suspensions of organoid cultures”. For FACS staining, cells were incubated with EPCAM-PeCy7 (BioLegend) and DAPI (Sigma-Aldrich) for 10 min on ice. A DAPI only control served as the fluorophore minus one (FMO) control for EPCAM. FACS was performed on a FACSAria II and analysis was done with FlowJo.

RNA extraction and bulk RNA-Seq of organoids

EPCAM+ cells were obtained from organoid cultures as described in section “FACS to prepare organoid cultures for RNA-Seq”. RNA was extracted using the Absolutely RNA Microprep Kit (Agilent). After RNA extraction, all downstream quality control steps, library preparation, sequencing, and differential gene expression analysis were performed by the Molecular Biology Core Facilities at Dana-Farber Cancer Institute. Complementary DNA (cDNA) was synthesized with Clontech SmartSeq v4 reagents from 2 ng of RNA. Full length cDNA was fragmented to a mean size of 150 bp with a Covaris M220 ultrasonicator and Illumina libraries were prepared from 2 ng of sheared cDNA using Takara Thruplex DNAseq reagents according to the manufacturer’s protocol. The finished double strand DNA libraries were quantified by Qubit fluorometer, Agilent TapeStation 2200, and RT-qPCR using the Kapa Biosystems library quantification kit. Uniquely indexed libraries were pooled in equimolar ratios and sequenced on an Illumina NextSeq500 run with single-end 75 bp reads at the Dana-Farber Cancer Institute Molecular Biology Core Facilities.

Bioinformatic analysis of bulk RNA-Seq

Sequenced reads were aligned to the UCSC hg19 reference genome assembly and gene counts were quantified using STAR (v2.5.1b) (Dobin et al., 2013). Differential gene expression testing was performed by DESeq2 (v1.10.1) (Love et al., 2014) and normalized read counts (FPKM) were calculated using cufflinks (v2.2.1) (Trapnell et al., 2010). RNA-Seq analysis was performed using the VIPER snakemake pipeline (Cornwell et al., 2018).

ScRNA-Sequencing of GEMM and organoids

ScRNA-Seq was performed using the 10X Genomics platform (10X Genomics, Pleasanton, CA). FACS sorted cells from either mice or organoid cultures were encapsulated with a 10X Genomics Chromium Controller Instrument using the Chromium™ Single Cell A Chip Kit. Encapsulation, reverse transcription, cDNA amplification, and library preparation reagents were from the Chromium™ Single Cell 3’ Library & Gel Bead Kit v2. Briefly, single cells were resuspended in PF10 at a concentration of 1,000 cells/μl. The protocol was performed as per 10X Genomics protocols without modification (Chromium Single Cell 3’ Reagent Kits User Guide, v2 Chemistry). Total cDNA and cDNA quality following amplification and clean-up were determined using a Qubit™ dsDNA HS assay kit and the Agilent TapeStation High Sensitivity D5000 ScreenTape System. Library quality was determined using Agilent TapeStation and qPCR prior to sequencing. TapeStation analysis and library qPCR were performed by the Biopolymers Facility at Harvard Medical School. Libraries were sequenced on an Illumina NextSeq500 using paired-end sequencing with single indexing (Read 1 = 26 cycles, Index (i7) = 8 cycles, and Read 2 = 98 cycles). Reads were aligned to the mm10 reference genome and count matrices were generated using CellRanger3.0.0 (10X Genomics).

Bioinformatics for GEMM and organoid scRNA-Seq

Count matrices generated by CellRanger3.0.0 were read into the Python single cell analysis environment Scanpy (v 1.4.4) (Wolf et al., 2018). In brief, cells with >10% mitochondrial content, which correlated with low read count, were removed. The data were normalized and logarithmized, and the number of significant principal components was determined using in-built Scanpy functions. Data were de-noised using Markov Affinity-based Graph Imputation (v 1.5.5) with the following settings (Gene to return = all, k=3, t=3, n_pca=30) (van Dijk et al., 2018). Gene Ontology enrichment analysis was performed with Enrichr (Kuleshov et al., 2016) using the GSEAPY (v 0.9.13) python wrapper. A reference list of murine transcription factors and transcription co-factors was obtained from the Animal Transcription Factor Database (Hu et al., 2019). Lists of genes activated by specific transcription factors were from the TRRUST database (Han et al., 2018). The KRAS activation signature was previously described (Barbie et al., 2009; Bild et al., 2006). Murine AT2 marker genes are from PanglaoDB (Franzén et al., 2019). All gene lists can be found in table S1. Data were visualized using in-built Scanpy plotting functions, Seaborn (v0.9.0) (https://seaborn.pydata.org/), and Matplotlib (v 3.0.2) (Hunter, 2007).
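
The first preprocessing steps described above (mitochondrial filtering, normalization, log transformation) can be illustrated with a small, dependency-free sketch. This is not the Scanpy code used in the study: the analysis relied on Scanpy's in-built functions, whereas the helpers below are hypothetical plain-Python stand-ins. The `mt-` gene prefix for mouse mitochondrial genes and the scaling target of 10,000 counts per cell are assumptions for illustration, as are the toy counts.

```python
import math

def mito_fraction(counts, mito_prefix="mt-"):
    """Fraction of a cell's counts that come from mitochondrial genes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mito = sum(c for gene, c in counts.items() if gene.startswith(mito_prefix))
    return mito / total

def filter_and_normalize(cells, max_mito=0.10, target_sum=1e4):
    """Drop cells above the mitochondrial cutoff, then depth-normalize and log1p."""
    kept = {}
    for barcode, counts in cells.items():
        total = sum(counts.values())
        if total == 0 or mito_fraction(counts) > max_mito:
            continue  # empty or high-mitochondrial cell: excluded
        scale = target_sum / total
        kept[barcode] = {gene: math.log1p(c * scale) for gene, c in counts.items()}
    return kept

# Hypothetical toy cells.
toy_cells = {
    "cell_a": {"Sftpc": 80, "Actb": 15, "mt-Co1": 5},   # 5% mitochondrial: kept
    "cell_b": {"Sftpc": 10, "Actb": 10, "mt-Co1": 30},  # 60% mitochondrial: removed
}
processed = filter_and_normalize(toy_cells)
```

Only `cell_a` survives the 10% cutoff in this toy input. In a real analysis the same steps would be applied to the full CellRanger count matrix before principal component analysis and imputation.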

RNA Velocity

Velocyto (0.17.16) was run on the KY GEMM and KY organoid CellRanger output files using the run10X shortcut and the mm10 genome annotation file provided with the CellRanger pipeline. Loom files generated by Velocyto for each sample were concatenated into an anndata object. To visualize velocity on the original UMAP embedding a new anndata was created by merging the velocity and original anndata objects using the utils.merge() function in scVelo (0.1.25). Velocity was calculated using the merged anndata object and in-built velocity functions.

Human iPSC studies

Generation of BU3 NGST-TetOn:KRASG12D line

To generate a dox-inducible KRASG12D cassette targeted to the AAVS1 locus by gene editing, the previously published BU3 NGST human iPSC line was used (Jacob et al., 2017). PZ P 4X(cHS4) TetON-3XFLAG-tdT CAGG-m2rtTA v2, an optimized targeting vector for the AAVS1 locus, was obtained as the kind gift of Laura Ordovas (Ordovás et al., 2015). This vector has the addition of two cHS4 insulators on either side of the transgene to reduce the potential for silencing. In addition, the construct contains an m2rtTA under the control of a CAG promoter and a T2A:puromycin resistance gene that should only be active when inserted near a coding sequence, improving the selection specificity. Human KRASG12D was PCR amplified from pBabe-Kras G12D, a gift from Channing Der (Addgene plasmid # 58902; http://n2t.net/addgene:58902; RRID:Addgene_58902) using primers hKRAS mutG12D PmeI and hKRAS mutG12D MluI. The resulting PCR product was cloned into PZ P 4X(cHS4) TetON-3XFLAG-tdT CAGG-m2rtTA v2 using EcoRV and MluI restriction sites to generate a new vector named AAVS1-TetOn:KRASG12D. For targeting the BU3 NGST iPSC line, 4 × 10^6 live cells were resuspended in Amaxa™ P3 primary cell nucleofection solution containing 1 μg per 10^6 cells of the AAVS1-TetOn:KRASG12D plasmid and the left and right zinc finger plasmids targeting the AAVS1 locus. The cells were then nucleofected using the human embryonic stem cell (hESC), H9 standard program on the Lonza 4D-nucleofector™. The cells were then resuspended in mTeSR™ with 10 μM Y27632 and plated on a 10 cm hESC Matrigel coated plate. Cells were selected using puromycin at 500–700 ng/ml starting a minimum of 96 h after nucleofection. Selection was maintained for 7–10 days as the resistant colonies emerged and grew. Successful colonies were manually picked into 24-well hESC Matrigel coated plates in mTeSR™ with 10 μM Y27632. 
Genomic DNA from each clone was screened for insertion using primers Z-AV-4 (binds in the AAVS1 locus outside the donor arm)/T2A R and correct insertion validated by sequencing. Positive clones were expanded, re-selected with puromycin and frozen, and a single clone was carried forward after G-banding analysis to confirm normal 46XY karyotype.

Lung differentiation and flow cytometry

Lung differentiation of the iPSC line (BU3 NGST-TetOn:KRASG12D) into alveolar type 2 cells was performed according to the detailed protocol previously published by Jacob et al. (Jacob et al., 2017, 2019). Briefly, iPSC-derived NKX2–1GFP+ lung epithelial progenitors generated after 15 days of directed differentiation were purified by GFP+ flow cytometry sorting and replated for further distal lung/alveolar differentiation in 3D Matrigel cultures. The resulting monolayered epithelial spheres were maintained as self-renewing distal alveolar epithelial cells by serial passaging approximately every 2 weeks in serum-free, feeder-free 3D culture (“CK+DCI” media as detailed in Jacob et al., 2019). Quality and phenotype of the cultures were monitored at each passage by flow cytometry quantitation of NKX2–1GFP and SFTPCtdTomato expression as shown in the text. Detailed protocols for cell preparation for flow cytometry and analysis of these reporters have been previously published (Jacob et al., 2019). Briefly, for flow cytometry analysis, cells were resuspended in FACS buffer (PBS with 2% FBS and 10 nM calcein blue AM (ThermoFisher)) and analyzed on an S1000EXi flow cytometer (Stratedigm, San Jose, CA). For cell sorting, cells were resuspended in FACS buffer plus 10 μM Y-27632 to support viability in replated cells. Live cells were sorted on a high speed cell sorter (MoFlo Legacy, Beckman Coulter) at the Boston University Medical Center Flow Cytometry Core Facility based on NKX2–1GFP expression. All differentiation and passaging protocols for iAT2s are also available for free download from the protocols webpage of www.kottonlab.com.

Proteomic and phosphoproteomic analysis

After 3 passages as NKX2–1GFP+ sorted alveolospheres, iAT2s were treated with dox (1 μg/ml) or DMSO for 15 days. Four replicates of each condition were dissociated and sorted on live, NKX2–1GFP+ cells using the previously described protocol (Jacob et al., 2017, 2019), and collected as cell pellets. In order to interrogate the proteome and phosphoproteome of +/−dox-exposed KRASG12D targeted iAT2s, the cell pellets were resuspended in lysis buffer composed of 6 M GuHCl (guanidinium chloride), 100 mM Tris pH 8, 40 mM chloroacetamide, 10 mM TCEP (tris(2-carboxyethyl)phosphine), and phosphatase inhibitors (PhosStop, Roche), and sonicated with a Branson probe. Total protein content was quantified and equal amounts of denatured protein were allocated from each sample, diluted with 7 volumes of 100 mM Tris, and trypsin digested into peptides. The peptide mixtures from control iAT2s vs dox-exposed iAT2s were individually labelled with a distinct isobaric TMT-10plex reagent. After pooling, the mixture was injected onto a reverse-phase Waters Xbridge C18 HPLC column to fractionate the multiplexed peptides, which markedly increased depth of coverage. Peptides were eluted in 12 fractions over 48 min. For the total proteome analysis, 5% of each fraction was analyzed directly by LC/MS. The remaining 95% was set aside for phospho-peptide enrichment using Fe-NTA magnetic beads (Cube Biotech) (Leutert et al., 2019), totaling 24 injections analyzed by precision mass spectrometry (LC/MS). We used the MaxQuant (1.6.7.0; http://maxquant.org/) software package for protein identification by searching with the UniProt Human database (accessed April 2019) and relative quantification of the TMT reporter labels (Cox et al., 2011). Standard search parameters included allowing for two missed trypsin cleavage sites, variable modifications of methionine oxidation and N-terminal acetylation, and fixed modification of carbamidomethylation of cysteine residues. 
Phosphorylation at S, T, and Y residues was included as a variable modification for the phosphoproteomic data. Ion tolerances of 20 and 4.5 ppm were set for first and second searches, respectively. After stringent filtering (peptide and protein level FDR of 1% as determined by reverse decoy search), cognate proteins were identified using strict matching parameters guided by principles of parsimony to account for all observed peptide hits. Matches were pruned by filtering out candidates supported by only a single unique peptide. For the identification of phosphopeptides, only modified peptides with unambiguous single site-localization probabilities of at least 0.7 were retained for downstream (differential and pathway enrichment) analyses. For quantitative comparisons of the samples, summed protein intensities were log transformed, LoessF normalized, and statistically significant changes determined using empirical Bayes analysis implemented in the limma package (Phipson et al., 2016) in R: A language and environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org). Gene Set Enrichment Analysis (GSEA) was performed using the fgsea package in R (Sergushichev, 2016).
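
Two of the filtering rules stated above, along with the injection count, can be expressed as a brief sketch. This is a hedged illustration only: the actual filtering was done on MaxQuant output, and the record layout, identifiers, and function names below are hypothetical. The injection count assumes each of the 12 fractions was run once for the total proteome and once after phospho-enrichment.

```python
def injection_count(n_fractions=12, analyses_per_fraction=2):
    """Each fraction is analyzed once for the proteome and once for the phosphoproteome."""
    return n_fractions * analyses_per_fraction

def keep_localized_sites(sites, min_prob=0.7):
    """Retain phosphosites whose site-localization probability is at least min_prob."""
    return [s for s in sites if s["loc_prob"] >= min_prob]

def prune_single_peptide_proteins(protein_to_peptides):
    """Drop proteins supported by only a single unique peptide."""
    return {
        protein: peptides
        for protein, peptides in protein_to_peptides.items()
        if len(set(peptides)) >= 2
    }

# Hypothetical toy records (identifiers are placeholders, not real hits).
toy_sites = [
    {"site": "PROT_A_S10", "loc_prob": 0.95},
    {"site": "PROT_A_T42", "loc_prob": 0.55},
    {"site": "PROT_B_Y7", "loc_prob": 0.70},
]
toy_proteins = {
    "PROT_A": ["PEPTIDEA", "PEPTIDEB"],
    "PROT_B": ["PEPTIDEC"],
    "PROT_C": ["PEPTIDED", "PEPTIDED"],  # the same peptide observed twice
}
localized = keep_localized_sites(toy_sites)
supported = prune_single_peptide_proteins(toy_proteins)
```

Counting unique peptides rather than peptide observations is the design choice that matters here: a protein seen many times through one peptide sequence is still removed.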

ScRNA-Seq

Parallel wells of iAT2s (derived from BU3 NGST-TetOn:KRASG12D iPSCs, beginning at sphere passage P3) were treated with either control vehicle (DMSO) or doxycycline (dox; 1 μg/mL) to induce expression of KRASG12D. After 4 more passages and 69 days of exposure to dox or DMSO (total differentiation time = 127 days), cells were dissociated from 3D Matrigel (as described in Jacob et al., 2019) and sorted for Calcein Blue+ live cells. scRNA-seq of all Calcein Blue-stained live cells was performed using the 10X Chromium system with v3 chemistry as previously published (McCauley et al., 2017). Library preparation and sequencing were done at the Boston University Microarray and Sequencing Resource (BUMSR) Core using an Illumina NextSeq 500 instrument.

Bioinformatics analysis of scRNA-Seq

Reads were demultiplexed and aligned to the human genome assembly (GRCh38, Ensembl) with the CellRanger pipeline v.3.0.2 (10X Genomics). Further analyses were done using Seurat v. 3.1.4 (Stuart et al., 2019). Cells with more than 25% mitochondrial content or fewer than 800 detected genes were excluded from downstream analyses (leaving 775 control and 1322 dox-treated cells). We then filtered out the non-lung endoderm population from the control sample (149 cells), leaving a total of 626 cells in the control population and 1322 cells in the dox+ population. We normalized and scaled the UMI counts using regularized negative binomial regression (SCTransform; Hafemeister and Satija, 2019). Following the standard procedure in Seurat’s pipeline, we performed linear dimensionality reduction (principal component analysis; PCA) and used the top 20 principal components to compute both the UMAP (Diaz-Papkovich et al., 2019) and the clusters (Louvain method; Blondel et al., 2008), which were computed at a range of resolutions from 1.5 to 0.05 (more to fewer clusters). For downstream analyses, we refer to the 3 clusters identified at resolution 0.1 (Figure S5C). Cell cycle scores and classifications were computed with Seurat using the method from Tirosh et al. (2016). The same method was used to calculate the enrichment in the iAT2 differentiation and maturation signatures from Hurley et al. (2020). The cut-offs for independent filtering (Bourgon et al., 2010) prior to DE testing required genes to be a) detected in at least 10% of the cells of either population and b) changed by a natural log fold change of at least 0.25 between populations. The tests were performed using Seurat’s wrapper for the MAST framework (Finak et al., 2015), identifying 393 differentially expressed genes between control and dox-treated cells (Table S5). For a comparison of the performance of methods for single-cell DE, see Soneson and Robinson (2018). The top 20 upregulated genes (FDR < 0.05) in each cluster, ranked by fold change, are shown in a heatmap (Figure S5E).
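The cell-level QC thresholds and the gene-level pre-filter described above can be sketched in plain Python. This is a simplified illustration, not the Seurat code used in the study: the field names are hypothetical, the fold change is computed here as the difference of ln(mean + 1) between populations (an assumption about the exact formula), and the cut-off is applied to its absolute value (also an assumption).

```python
import math

# Thresholds taken from the text above.
MAX_MITO_FRACTION = 0.25       # cells above 25% mitochondrial content are excluded
MIN_GENES = 800                # cells with fewer than 800 detected genes are excluded
MIN_DETECTION_FRACTION = 0.10  # gene detected in >= 10% of cells of either population
MIN_LN_FOLD_CHANGE = 0.25      # natural log fold change between populations


def passes_qc(cell):
    """Return True if a cell survives the mitochondrial and gene-count filters."""
    return cell["mito_fraction"] <= MAX_MITO_FRACTION and cell["n_genes"] >= MIN_GENES


def detection_fraction(expr):
    """Fraction of cells in which the gene has a non-zero value."""
    return sum(1 for x in expr if x > 0) / len(expr)


def passes_prefilter(expr_a, expr_b):
    """Gene-level filter applied before DE testing.

    expr_a and expr_b hold one gene's expression values in the two populations.
    """
    detected = max(detection_fraction(expr_a), detection_fraction(expr_b))
    if detected < MIN_DETECTION_FRACTION:
        return False
    mean_a = sum(expr_a) / len(expr_a)
    mean_b = sum(expr_b) / len(expr_b)
    ln_fc = math.log(mean_a + 1) - math.log(mean_b + 1)  # assumed formula
    return abs(ln_fc) >= MIN_LN_FOLD_CHANGE
```

Filtering genes before testing reduces the number of hypotheses, which is the rationale of independent filtering; the DE test itself (MAST) is not reproduced here.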

Patient stage IA lung cancer studies

Sample collection and preparation for scRNA-Seq

Lung cancer resection specimens were obtained from patients with a radiographic diagnosis of stage IA lung cancer. All patients provided written informed consent. KRAS mutation status of tumors was determined by targeted sequencing. Resected tissues were placed on ice in RPMI medium immediately after resection and delivered to the lab for tissue dissociation. Dissociation was performed in RPMI medium supplemented with 10% FBS. Briefly, tissues were sliced into approximately 1 mm3 pieces and dissociated in 1 mg/ml collagenase (Sigma Aldrich, #C9407) and 1000 U/ml DNAse I (Sigma Aldrich, #D4263–1VL) at 37°C for approximately 1 hour until homogeneous. The suspension was then passed through a 40 μm strainer to remove cell aggregates, and red blood cells were lysed with 1 ml of ACK buffer (Sigma Aldrich, #11814389001). Cells were resuspended in 5 ml DPBS + 0.04% BSA, counted, and immediately used to prepare the sequencing libraries.

ScRNA-Seq and read alignment

The 10X Genomics platform (10X Genomics, Pleasanton, CA) was used to profile human single-cell transcriptomes. Single cell encapsulation, library construction, and sequencing were performed at the Technology Center for Genomics and Bioinformatics at UCLA according to the manufacturer’s protocols. The Chromium™ Single Cell 3’ Library & Gel Bead Kit v2 and v3 were used for library preparation. Libraries were sequenced on an Illumina NovaSeq 6000 instrument. CellRanger 3.0.0 software (10X Genomics) was used to align reads to the human GRCh38 reference and generate count matrices.

Bioinformatics analysis

Human single cell transcriptome data were analyzed following the Seurat pipeline (Stuart et al., 2019). Poor quality cells with > 15% mitochondrial content and fewer than 500 detected features were filtered out. The data were normalized and batch-adjusted based on the Seurat standard workflow. Cell clustering analyses were performed on the adjusted data, first to separate immune cells from non-immune cells in silico and then to identify lung-specific cell subtypes among non-immune cells. A pseudobulk approach was used to identify differentially expressed genes (DEGs) in AT2 cells from tumor and the associated normal lung tissue. Patient-associated variation was included when modeling DEGs using the edgeR package.
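The pseudobulk step can be sketched as follows, under the assumption that counts are summed per patient and tissue type before modeling. The sketch is illustrative only: the field names (`patient`, `tissue`, `counts`) are hypothetical, and the actual aggregation and edgeR model used in the study are not shown.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-gene UMI counts over all cells sharing a (patient, tissue) pair.

    Each cell is a dict with hypothetical keys 'patient', 'tissue', and
    'counts' (a mapping from gene name to UMI count).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cell in cells:
        key = (cell["patient"], cell["tissue"])
        for gene, count in cell["counts"].items():
            totals[key][gene] += count
    # Convert nested defaultdicts to plain dicts for downstream use.
    return {key: dict(genes) for key, genes in totals.items()}


if __name__ == "__main__":
    # Hypothetical AT2 cells from one patient, for illustration only.
    at2_cells = [
        {"patient": "pt1", "tissue": "tumor", "counts": {"SFTPC": 3, "KRAS": 1}},
        {"patient": "pt1", "tissue": "tumor", "counts": {"SFTPC": 2}},
        {"patient": "pt1", "tissue": "normal", "counts": {"SFTPC": 9}},
    ]
    print(pseudobulk(at2_cells))
```

Each aggregated profile can then be treated as one sample in a count-based model, with patient as a factor so that tumor and matched normal tissue are compared within patients.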

QUANTIFICATION AND STATISTICAL ANALYSIS

Statistics

Statistical testing was performed using GraphPad Prism or SciPy 1.3.0 statistical functions (scipy.stats). The tests used to determine statistical significance are stated in the corresponding figure legends. P-values are indicated in the figures, and P-values < 0.05 were considered significant.

Supplementary Material

Table S1, related to figures 1, S1, 4, S4, S5, 6, and S6: TF/TCF and gene signature lists. List of TF/TCF, and genes used to create signature z-scores for KRAS, AT2 identity, Sox9 targets, RELA/NF-kappaB targets, and proliferation.

Table S2, related to figure 1 and S1: Differentially expressed genes and Gene Ontology (GO) analysis in scRNA-Seq data of LUAD GEMM. Top 1000 differentially upregulated genes in each cluster unfiltered and filtered for TF/TCF; n = gene name, p = p-value, l = log2-fold change. Identification of unique and common GO terms between C0 and C1 in early-stage lung cancer GEMM. All common and unique pathways are statistically significant.

Table S3, related to figure 3 and S3: Bulk RNA-Seq gene lists of organoids. Differentially expressed genes in KY-CRE vs. KY-Emp (KY-Dif) and KPY-CRE vs. KPY-Emp (KPY-Dif), and list of shared up- and down-regulated genes of KY-Dif compared to KPY-Dif.

Table S4, related to figure 4 and S4: Differentially expressed genes and Gene Ontology (GO) analysis in scRNA-Seq data of KY organoids. Top 1000 differentially upregulated genes in each cluster unfiltered and filtered for TF/TCF; n = gene name, p = p-value, l = log2-fold change. Identification of unique GO terms between all clusters. All unique pathways are statistically significant.

Table S5, related to figure 5 and S5: Differentially expressed genes in scRNA-Seq data of iAT2s. Proteomics and phosphoproteomics results of iAT2s. DEG analysis and mass spectrometry results of KRASG12D-expressing iAT2s (dox) vs. control (DMSO).

Table S6, related to figure 6 and S6: Differentially expressed genes in scRNA-Seq data of LUAD stage IA patient samples. DEG analysis comparing AT2 stage IA to AT2 normal.

KEY RESOURCES TABLE

REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies
Rat monoclonal anti-CD45 APC [30-F11, BD] Thermo Fisher Scientific Cat#BDB559864
Rat monoclonal anti-CD31 APC [MEC 13.3, BD] Thermo Fisher Scientific Cat# BDB551262
Rat monoclonal anti-CD326 (EP-CAM) PE/Cy7 [G8.8] BioLegend RRID:AB_1236471; Cat#118216
Rat monoclonal anti-Ly-6A/E (Sca1) APC/Cy7 [D7] Thermo Fisher Scientific RRID:AB_1727552; Cat#560654
Rabbit monoclonal anti-SP-C [EPR19839] Abcam Cat#ab211326
Rat monoclonal anti-Ki67 [SolA15] Thermo Fisher Scientific RRID:AB_10854564;Cat#14–5698-82
Rabbit monoclonal anti-TTF1 (Nkx2–1) [8G7G3/1] Abcam RRID:AB_1310784; Cat#ab76013
Mouse monoclonal anti-Hmga2 [GT763] GeneTex Cat#GTX629478
Goat polyclonal anti-GFP (YFP) Abcam RRID:AB_305643;Cat#ab6673
Donkey anti-rat Alexa 594 Invitrogen RRID:AB_2535795;Cat#A-21209
Donkey anti-goat Alexa Fluor 488 Invitrogen RRID:AB_2534102; Cat#A-11055
Donkey anti-goat Alexa Fluor 647 Invitrogen RRID:AB_141844;Cat#A-21447
Donkey anti-rabbit Alexa Fluor 488 Invitrogen RRID:AB_141708;Cat#A-21206
Donkey anti-rabbit Alexa Fluor 594 Invitrogen RRID:AB_141637;Cat#A-21207
Donkey anti-mouse Alexa Fluor 647 Invitrogen RRID:AB_162542;Cat#A-31571
Mouse monoclonal antibody to human CKIT, allophycocyanin (APC) conjugated Life Technologies Cat#CD11705; RRID: AB_1463361
Mouse monoclonal IgG2a antibody against human, rhesus, cynomolgus CD184(CXCR4) Clone 12G5 Stem Cell Technologies Cat #60089PE
Mouse IgG1 isotype, APC conjugated Life Technologies Cat#MA5–18093; RRID: AB_2539476
Mouse IgG2a isotype, PE-conjugated Stem Cell Technologies Cat#60108PE
Bacterial and Virus Strains
Ad5CMVempty Viral Vector Core University of Iowa Lot:Ad4154; Cat#VVC-U of Iowa-272
Ad5CMVCre Viral Vector Core University of Iowa Lot: Ad4117; Cat#VVC-U of Iowa-5
Biological Samples
Chemicals, Peptides, and Recombinant Proteins
GFR Matrigel Corning Cat#356231
Bleomycin Sulfate Sigma-Aldrich Cat#B2434
Dispase Corning Cat#CB-40235
Collagenase/Dispase Roche Cat#10269638001
DNAse Sigma-Aldrich Cat#D4527
PmeI New England Biolabs Cat# R0560S
MluI New England Biolabs Cat# R0198S
Puromycin Old stock, unknown N/A
EcoRV New England Biolabs Cat# R0195S
Growth Factor Reduced Matrigel (3D Matrigel) Corning Cat# 356230
Human embryonic stem cell (hESC)-qualified Matrigel (2D Matrigel) Corning Cat# 354277
CHIR99021 (CHIR) Tocris Cat# 4423
Recombinant Human Keratinocyte Growth Factor (KGF) R&D Systems Cat# 251-KG-010
Recombinant Human BMP4 (rhBMP4) R&D Systems Cat# 314-BP
Hyclone Fetal Bovine Serum (characterized; FBS) GE Healthcare Life Sciences Cat# SH30071.03
Rho-associated kinase inhibitor (Y-27632 dihydrochloride; Y) Tocris Cat# 1254
0.05% Trypsin-EDTA Gibco Cat# 25–300-062
Dexamethasone (Dex) Sigma Aldrich Cat# D4902
3-Isobutyl-1-methylxanthine (IBMX) Sigma Aldrich Cat# I5879
8-Bromoadenosine 3′, 5′-cyclic monophosphate sodium salt (cAMP) Sigma Aldrich Cat# B7880
Retinoic Acid (Ra) Sigma Aldrich Cat# R2625
Doxycycline Hydrochloride (Dox) Sigma Aldrich Cat# D3072
Dimethyl Sulfoxide (DMSO) Sigma Aldrich Cat# D2650
Dorsomorphin (DS) Stemgent Cat# 04–0024
SB431542 (SB) Tocris Cat# 1614
Dispase II Thermo Fisher Scientific Cat# 17105–041
Ascorbic Acid Sigma Aldrich Cat# A4544
1-Thioglycerol (MTG) Sigma Aldrich Cat# M6145
BSA 7.5% Stock Thermo Fisher Scientific Cat# 15260037
Ethylenediaminetetraacetic acid (EDTA) Sigma Aldrich Cat# E7889
N-(2-Hydroxyethyl)piperazine-N′-(2-ethanesulfonic acid)Solution (HEPES) Sigma Aldrich Cat# H0887
Critical Commercial Assays
Chromium™ Single Cell 3' Library & Gel Bead Kit v2, 16 rxns 10X Genomics Cat#120237
Chromium™ Single Cell A Chip Kit, 48 rxns 10X Genomics Cat#120236
Chromium™ i7 Multiplex Kit, 96 rxns 10X Genomics Cat#120262
Amaxa™ P3 Primary Cell Kit Lonza Cat#V4XP-3024
Stem Diff Definitive Endoderm Kit StemCell Technologies Cat#05210
RNeasy Mini Kit Qiagen Cat#741404
Qiazol Lysis Reagent Qiagen Cat#79306
TaqMan Fast Universal PCR Master Mix (2X), no AmpErase UNG Thermo Fisher Scientific Cat#4364103
High-Capacity cDNA Reverse Transcription Kit Applied Biosystems Cat#4368814
Deposited Data
Jupyter notebooks for GEMM and organoid single-cell RNA-Seq analysis This paper https://github.com/alm8517/Kras_invivo_organoid
Single cell RNA-seq raw data (GEMM / organoid) This paper GEO - GSE149813 / GSE149909
Single cell RNA-seq features/matrix/barcode files (GEMM / organoid) This paper GEO - GSE149813 / GSE149909
Bulk RNA-Seq raw data This paper GEO - GSE150425
iAT2 single cell RNA-Seq data This paper GEO - GSE150263; www.kottonlab.com
Code for iAT2 scRNA-Seq analysis This paper https://github.com/cvillamar/Vedaie_CReM
Human patient stage IA single cell RNA-Seq data This paper GEO - GSE149655
Mass spectrometry proteomics data iAT2 This paper PRIDE - PXD019240
Experimental Models: Cell Lines
Human: Normal donor iPSC line targeted with NKX2–1GFP SFTPCtdTomato (BU3 NGST) Kotton Lab (Jacob et al. 2017) RRID: CVCL_WN82
Experimental Models: Organisms/Strains
Gt(ROSA)26Sortm1(EYFP)Cos The Jackson Laboratory Cat#006148
KrasLSL-G12D/+ Jackson et al., 2001 N/A
KrasLSL G12D/+; p53fl/fl Jackson et al., 2005 N/A
Hsd:Athymic Nude-Foxn1nu ENVIGO Cat#6903F
Oligonucleotides
hKRAS mut G12D PmeI: gtggcaagtttaaacATGACTGAATATAAACTTGTGGTAG Mostoslavsky Lab N/A
hKRAS mut G12D MluI: ccaatcaggccacgcgtTTACATAATTACACACTTTGTC Mostoslavsky Lab N/A
Z-AV-4: gccggaactctgccctctaacgct Kotton Lab N/A
T2A R: GATTCTCCTCCACGTCACCGC Mostoslavsky Lab N/A
Taqman Gene Expression Assay Primer/Probe Set: KRAS Thermo Fisher Scientific Hs00364284_g1
Taqman Gene Expression Assay Primer/Probe Set: NKX2–1 Thermo Fisher Scientific Hs00968940_m1
Taqman Gene Expression Assay Primer/Probe Set: SFTPC Thermo Fisher Scientific Hs00161628_m1
Recombinant DNA
pBabe-Kras G12D Channing Der Addgene plasmid # 58902 RRID:Addgene_58902
pZ P 4X(cHS4) TetON-3XFLAG-tdT CAGG-m2rtTA v2 (Ordovas et al., 2015) N/A
AAVS1 Zinc Finger R N/A
AAVS1 Zinc Finger L N/A
Software and Algorithms
ImageJ Schneider et al. 2012 https://imagej.nih.gov/ij/
GraphPad Prism for MacOS version 8.2.1 GraphPad Software https://www.graphpad.com/scientific-software/prism/
FlowJo version 10.5.3 Becton, Dickinson & Company https://www.flowjo.com/
Scanpy 1.4.4 Wolf et al. 2017 https://github.com/theislab/scanpy
Velocyto 0.17.16 La Manno et al. 2018 https://github.com/velocyto-team/velocyto.py
scVelo 0.1.25 Theis lab https://github.com/theislab/scvelo
CellRanger 3.0.0 10X Genomics https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/installation
Matplotlib 3.0.2 Hunter. 2007 https://matplotlib.org/index.html
Seaborn 0.9.0 https://seaborn.pydata.org/#
Enrichr in gseapy 0.9.13 Kuleshov et al. 2016 https://github.com/zqfang/GSEApy/blob/master/docs/index.rst
Markov Affinity-based Graph Imputation of Cells (MAGIC) 1.5.5 van Dijk et al. 2018 https://github.com/KrishnaswamyLab/MAGIC
Other
ProLong™ Gold Antifade Mountant with DAPI Invitrogen Cat#P36935
DAPI Sigma-Aldrich Cat#D9542
Transwells Corning Cat#3470
SPRI Select Reagent Beckman Coulter Cat#NC0406407
Qubit™ dsDNA HS Assay Kit Invitrogen Cat#Q32851
Calcein Blue AM Life Technologies Cat#C1429
Hank’s Buffered Saline Solution (HBSS; no calcium, no magnesium, no phenol red) Gibco Cat#14175095
Gentle Cell Dissociation Reagent StemCell Technologies Cat#07174
GlutaMAX (100x) Thermo Fisher Scientific Cat#35050–061
Ham’s F12 Medium Cellgro Cat#10–080-CV
Iscove’s Modified Dulbecco’s Medium(IMDM) Thermo Fisher Scientific Cat#12440053
N2 Supplement Invitrogen Cat#17502–048
B27 Supplement Invitrogen Cat#15260–037
Primocin Invitrogen Cat#NC9141851
mTeSR1 StemCell Technologies Cat#05850

Highlights

  • Alveolar progenitor (AT2) cells are transcriptionally distinct upon KRAS expression

  • Alveolar epithelial organoids recapitulate early-stage lung adenocarcinoma

  • Oncogenic KRAS leads to loss of lineage identity in AT2 cells

  • Bulk, scRNAseq, and proteomic data from murine and human KRAS mutant AT2 cells

Acknowledgments

We thank members of the Kim, Kotton, Dubinett, and Yanagawa labs for helpful discussion, the Boston Children’s Hospital Flow Cytometry Core Facility, the Harvard Medical School Single-Cell Core Facility for use of their 10X Genomics Chromium Controller, the Harvard Medical School Biopolymers Facility for Illumina NextSeq500 Sequencing and cDNA/library quality control experiments, Yuriy Alekseyev and M.J. Mistretta of the Boston University School of Medicine (BUSM) Single Cell Sequencing Core, Brian R. Tilton of the BUSM Flow Cytometry Core, Greg Miller, The Center for Regenerative Medicine (CReM) Laboratory Manager, and Marianne James, CReM iPSC Core Manager, Lauren Winter and Tamara Silva for administrative support, the UCLA Translational Pathology Core Laboratory, the UCLA Jonsson Comprehensive Cancer Center for shared resources, the Rodent Histopathology Facility at Harvard Medical School, and the Molecular Biology Core Facility at Dana-Farber Cancer Institute for library preparation, sequencing, and data analysis of the bulk RNA-seq.

A.F.M.D is supported by a Boehringer Ingelheim Fonds PhD fellowship; A.L.M was supported by an NIH R01 Postdoctoral Supplement and currently by a Damon Runyon Cancer Research Foundation Postdoctoral Fellowship (DRG:2368–19) and a Postdoctoral Enrichment Program Award from the Burroughs Wellcome Fund; S.M.L is supported by a Hope Funds for Cancer Research Postdoctoral Fellowship, and S.P.R by an IASLC Young Investigator Fellowship. This work was supported in part by R01 HL090136, R01 HL132266, R01 HL125821, U01 HL100402 RFA-HL-09–004, R35HL150876–01, American Cancer Society Research Scholar Grant RSG-08–082-01-MGO, the V Foundation for Cancer Research, the Thoracic Foundation, the Ellison Foundation, the American Lung Association LCD-619492 and the Harvard Stem Cell Institute (C.F.K.).

CReM was supported by grants R24HL123828 and U01TR001810. D.N.K is supported by R01HL128172, R01HL095993, R01HL122442, U01HL134745, and U01HL134766. The iPSC model development and characterization (D.N.K., G.M., A.E., C.V., D.H., M.V., R.H., R.H.K., and B.C.B.) was sponsored by the Lung Cancer Initiative at Johnson & Johnson. A.E. acknowledges the generous start-up funding and ongoing support of Boston University to support the operations of the CNSB.

This work was also supported by a Stand Up To Cancer-LUNGevity-American Lung Association Lung Cancer Interception Dream Team Translational Cancer Research Grant (Grant Number: SU2C-AACR-DT23–17). Stand Up To Cancer is a division of the Entertainment Industry Foundation. Research grants are administered by the American Association for Cancer Research, the scientific partner of SU2C (S.M.D.). Additional support came from NIH/NCI Molecular Characterization Laboratory 5U01CA196408–04 (S.M.D.), UC Tobacco-Related Disease Research Program (TRDRP) 27IR-0036 (K.K.), and a Thoracic Surgery Foundation Research Award (E.F.).

C.F.K has a sponsored research agreement from Celgene/BMS, but this funding did not support the research described in this manuscript.

Footnotes

Declaration of Interests

William D. Wallace is a Member of the Leica Biosystems Medical Imaging Advisory Board.

Steven M. Dubinett is on the Scientific Advisory Boards of EarlyDiagnostics, Johnson & Johnson Lung Cancer Initiative, LungLife AI, Inc., and T-Cure Bioscience, Inc. He has received research funding from Johnson & Johnson Lung Cancer Initiative and Novartis.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

  1. Alanis DM, Chang DR, Akiyama H, Krasnow MA, and Chen J. (2014). Two nested developmental waves demarcate a compartment boundary in the mouse lung. Nat. Commun. 5.
  2. Antonängelo L, Tuma T, Fabro A, Acencio M, Terra R, Parra E, Vargas F, Takagaki T, and Capelozzi V. (2016). Id-1, Id-2, and Id-3 co-expression correlates with prognosis in stage I and II lung adenocarcinoma patients treated with surgery and adjuvant chemotherapy. Exp. Biol. Med. 241, 1159–1168.
  3. Barbie DA, Tamayo P, Boehm JS, Kim SY, Moody SE, Dunn IF, Schinzel AC, Sandy P, Meylan E, Scholl C, et al. (2009). Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 462, 108–112.
  4. Bender Kim CF, Jackson EL, Woolfenden AE, Lawrence S, Babar I, Vogel S, Crowley D, Bronson RT, and Jacks T. (2005). Identification of bronchioalveolar stem cells in normal lung and lung cancer. Cell 121, 823–835.
  5. Bild AH, Yao G, Chang JT, Wang Q, Potti A, Chasse D, Joshi MB, Harpole D, Lancaster JM, Berchuck A, et al. (2006). Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature 439, 353–357.
  6. Blondel VD, Guillaume J-L, Lambiotte R, and Lefebvre E. (2008). Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008.
  7. Bourgon R, Gentleman R, and Huber W. (2010). Independent filtering increases detection power for high-throughput experiments. Proc. Natl. Acad. Sci. U. S. A. 107, 9546–9551.
  8. Chang DR, Alanis DM, Miller RK, Ji H, Akiyama H, McCrea PD, and Chen J. (2013). Lung epithelial branching program antagonizes alveolar differentiation. Proc. Natl. Acad. Sci. U. S. A. 110, 18042–18051.
  9. Chen H, Liu H, and Qing G. (2018). Targeting oncogenic Myc as a strategy for cancer treatment. Signal Transduct. Target. Ther. 3, 1–7.
  10. Cheng YJ, Tsai JW, Hsieh KC, Yang YC, Chen YJ, Huang MS, and Yuan SS (2011). Id1 promotes lung cancer cell proliferation and tumor growth through Akt-related pathway. Cancer Lett. 307, 191–199.
  11. Collisson EA, Campbell JD, Brooks AN, Berger AH, Lee W, Chmielecki J, Beer DG, Cope L, Creighton CJ, Danilova L, et al. (2014). Comprehensive molecular profiling of lung adenocarcinoma: The cancer genome atlas research network. Nature 511, 543–550.
  12. Cornwell MI, Vangala M, Taing L, Herbert Z, Köster J, Li B, Sun H, Li T, Zhang J, Qiu X, et al. (2018). VIPER: Visualization Pipeline for RNA-seq, a Snakemake workflow for efficient and complete RNA-seq analysis. BMC Bioinformatics 19.
  13. Cox J, Neuhauser N, Michalski A, Scheltema RA, Olsen JV, and Mann M. (2011). Andromeda: A peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805.
  14. Curtis SJ, Sinkevicius KW, Li D, Lau AN, Roach RR, Zamponi R, Woolfenden AE, Kirsch DG, Wong KK, and Kim CF (2010). Primary tumor genotype is an important determinant in identification of lung cancer propagating cells. Cell Stem Cell 7, 127–133.
  15. Dang CV (2012). MYC on the path to cancer. Cell 149, 22–35.
  16. Deutsch EW, Bandeira N, Sharma V, Perez-Riverol Y, Carver JJ, Kundu DJ, Garcia-Seisdedos D, Jarnuczak AF, Hewapathirana S, Pullman BS, et al. (2020). The ProteomeXchange consortium in 2020: enabling “big data” approaches in proteomics. Nucleic Acids Res. 48, 1145–1152.
  17. Diaz-Papkovich A, Anderson-Trocme L, Ben-Eghan C, and Gravel S. (2019). UMAP reveals cryptic population structure and phenotype heterogeneity in large genomic cohorts. PLOS Genet. 15, e1008432.
  18. van Dijk D, Sharma R, Nainys J, Yim K, Kathail P, Carr AJ, Burdziak C, Moon KR, Chaffer CL, Pattabiraman D, et al. (2018). Recovering Gene Interactions from Single-Cell Data Using Data Diffusion. Cell 174, 716–729.e27.
  19. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, and Gingeras TR (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21.
  20. Drost J, Van Jaarsveld RH, Ponsioen B, Zimberlin C, Van Boxtel R, Buijs A, Sachs N, Overmeer RM, Offerhaus GJ, Begthel H, et al. (2015). Sequential cancer mutations in cultured human intestinal stem cells. Nature 521, 43–47.
  21. DuPage M, Dooley AL, and Jacks T. (2009). Conditional mouse lung cancer models using adenoviral or lentiviral delivery of Cre recombinase. Nat. Protoc. 4, 1064–1072.
  22. Finak G, McDavid A, Yajima M, Deng J, Gersuk V, Shalek AK, Slichter CK, Miller HW, McElrath MJ, Prlic M, et al. (2015). MAST: A flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278.
  23. Franzén O, Gan L-M, and Björkegren JLM (2019). PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data. Database 2019.
  24. Hafemeister C, and Satija R. (2019). Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 20, 296.
  25. Haigis KM (2017). KRAS Alleles: The Devil Is in the Detail. Trends in Cancer 3, 686–697.
  26. Han H, Cho JW, Lee S, Yun A, Kim H, Bae D, Yang S, Kim CY, Lee M, Kim E, et al. (2018). TRRUST v2: An expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. 46, D380–D386.
  27. Herriges JC, Verheyden JM, Zhang Z, Sui P, Zhang Y, Anderson MJ, Swing DA, Zhang Y, Lewandoski M, and Sun X. (2015). FGF-Regulated ETV Transcription Factors Control FGF-SHH Feedback Loop in Lung Branching. Dev. Cell 35, 322–332.
  28. Hu H, Miao Y-R, Jia L-H, Yu Q-Y, Zhang Q, and Guo A-Y (2019). AnimalTFDB 3.0: a comprehensive resource for annotation and prediction of animal transcription factors. Nucleic Acids Res. 47, D33–D38.
  29. Hunter JD (2007). Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 9, 99–104.
  30. Hurley K, Ding J, Villacorta-Martin C, Herriges MJ, Jacob A, Vedaie M, Alysandratos KD, Sun YL, Lin C, Werder RB, et al. (2020). Reconstructed Single-Cell Fate Trajectories Define Lineage Plasticity Windows during Differentiation of Human PSC-Derived Distal Lung Progenitors. Cell Stem Cell.
  31. Jackson EL, Willis N, Mercer K, Bronson RT, Crowley D, Montoya R, Jacks T, and Tuveson DA (2001). Analysis of lung tumor initiation and progression using conditional expression of oncogenic K-ras. Genes Dev. 15, 3243–3248.
  32. Jackson EL, Olive KP, Tuveson DA, Bronson R, Crowley D, Brown M, and Jacks T. (2005). The differential effects of mutant p53 alleles on advanced murine lung cancer. Cancer Res. 65, 10280–10288.
  33. Jacob A, Morley M, Hawkins F, McCauley KB, Jean JC, Heins H, Na CL, Weaver TE, Vedaie M, Hurley K, et al. (2017). Differentiation of Human Pluripotent Stem Cells into Functional Lung Alveolar Epithelial Cells. Cell Stem Cell 21, 472–488.e10.
  34. Jacob A, Vedaie M, Roberts DA, Thomas DC, Villacorta-Martin C, Alysandratos KD, Hawkins F, and Kotton DN (2019). Derivation of self-renewing lung alveolar epithelial type II cells from human pluripotent stem cells. Nat. Protoc. 14, 3303–3332.
  35. Jiang SS, Fang WT, Hou YH, Huang SF, Yen BL, Chang JL, Li SM, Liu HP, Liu YL, Huang CT, et al. (2010). Upregulation of SOX9 in lung adenocarcinoma and its involvement in the regulation of cell growth and tumorigenicity. Clin. Cancer Res. 16, 4363–4373.
  36. Kaisani A, Delgado O, Fasciani G, Kim SB, Wright WE, Minna JD, and Shay JW (2014). Branching morphogenesis of immortalized human bronchial epithelial cells in three-dimensional culture. Differentiation 87, 119–126.
  37. Kim M, Mun H, Sung CO, Cho EJ, Jeon HJ, Chun SM, Jung DJ, Shin TH, Jeong GS, Kim DK, et al. (2019). Patient-derived lung cancer organoids as in vitro cancer models for therapeutic screening. Nat. Commun. 10.
  38. Kruspig B, Monteverde T, Neidler S, Hock A, Kerr E, Nixon C, Clark W, Hedley A, Laing S, Coffelt SB, et al. (2018). The ERBB network facilitates KRAS-driven lung tumorigenesis. Sci. Transl. Med. 10.
  39. Kulesa PM, Morrison JA, and Bailey CM (2013). The neural crest and cancer: A developmental spin on melanoma. Cells Tissues Organs 198, 12–21.
  40. Kuleshov MV, Jones MR, Rouillard AD, Fernandez NF, Duan Q, Wang Z, Koplev S, Jenkins SL, Jagodnik KM, Lachmann A, et al. (2016). Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97.
  41. Lau AN, Curtis SJ, Fillmore CM, Rowbotham SP, Mohseni M, Wagner DE, Beede AM, Montoro DT, Sinkevicius KW, Walton ZE, et al. (2014). Tumor-propagating cells and Yap/Taz activity contribute to lung tumor progression and metastasis. EMBO J. 33, 1502–1502.
  42. Laughney AM, Hu J, Campbell NR, Bakhoum SF, Setty M, Lavallée VP, Xie Y, Masilionis I, Carr AJ, Kottapalli S, et al. (2020). Regenerative lineages and immune-mediated pruning in lung cancer metastasis. Nat. Med. 26, 259–269.
  43. Lee J-H, Bhang DH, Beede A, Huang TL, Stripp BR, Bloch KD, Wagers AJ, Tseng Y-H, Ryeom S, and Kim CF (2014). Lung stem cell differentiation in mice directed by endothelial cells via a BMP4-NFATc1-thrombospondin-1 axis. Cell 156, 440–455.
  44. Lee J-H, Tammela T, Hofree M, Jacks T, Regev A, and Kim CF (2017). Anatomically and Functionally Distinct Lung Mesenchymal Populations Marked by Lgr5 and Lgr6. Cell 170.
  45. Leutert M, Rodríguez-Mias RA, Fukuda NK, and Villén J. (2019). R2-P2 rapid-robotic phosphoproteomics enables multidimensional cell signaling studies. Mol. Syst. Biol. 15.
  46. Li L, Xu B, Zhang H, Wu J, Song Q, and Yu J. (2020). Potentiality of forkhead box Q1 as a biomarker for monitoring tumor features and predicting prognosis in non-small cell lung cancer. J. Clin. Lab. Anal. 34. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Li X, Nadauld L, Ootani A, Corney DC, Pai RK, Gevaert O, Cantrell MA, Rack PG, Neal JT, Chan CWM, et al. (2014). Oncogenic transformation of diverse gastrointestinal tissues in primary organoid culture. Nat. Med. 20, 769–777. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Lin C, Song H, Huang C, Yao E, Gacayan R, Xu S-M, and Chuang P-T (2012). Alveolar type II cells possess the capability of initiating lung tumor development. PLoS One 7, e53817. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Liu Q, Liu K, Cui G, Huang X, Yao S, Guo W, Qin Z, Li Y, Yang R, Pu W, et al. (2019). Lung regeneration by multipotent stem cells residing at the bronchioalveolar-duct junction. Nat. Genet. 51, 728–738. [DOI] [PubMed] [Google Scholar]
  50. Love MI, Huber W, and Anders S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Maeda Y, Davé V, and Whitsett JA (2007). Transcriptional Control of Lung Morphogenesis. Physiol. Rev. 87, 219–244. [DOI] [PubMed] [Google Scholar]
  52. La Manno G, Soldatov R, Zeisel A, Braun E, Hochgerner H, Petukhov V, Lidschreiber K, Kastriti ME, Lönnerberg P, Furlan A, et al. (2018). RNA velocity of single cells. Nature 560, 494–498. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Matano M, Date S, Shimokawa M, Takano A, Fujii M, Ohta Y, Watanabe T, Kanai T, and Sato T. (2015). Modeling colorectal cancer using CRISPR-Cas9-mediated engineering of human intestinal organoids. Nat. Med. 21, 256–262. [DOI] [PubMed] [Google Scholar]
  54. McCauley KB, Hawkins F, Serra M, Thomas DC, Jacob A, and Kotton DN (2017). Efficient Derivation of Functional Human Airway Epithelium from Pluripotent Stem Cells via Temporal Regulation of Wnt Signaling. Cell Stem Cell 20, 844–857.e6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Meylan E, Dooley AL, Feldser DM, Shen L, Turk E, Ouyang C, and Jacks T. (2009). Requirement for NF-κB signalling in a mouse model of lung adenocarcinoma. Nature 462, 104–107.
  56. Morrisey EE, and Hogan BLM (2010). Preparing for the First Breath: Genetic and Cellular Mechanisms in Lung Development. Dev. Cell 18, 8–23.
  57. Neidler S, Kruspig B, Hewit K, Monteverde T, Gyuraszova K, Braun A, Clark W, James D, Hedley A, Nieswandt B, et al. (2019). Identification of a clinically relevant signature for early progression in KRAS-driven lung adenocarcinoma. Cancers (Basel). 11.
  58. Nieto MA (2013). Epithelial plasticity: A common theme in embryonic and cancer cells. Science 342.
  59. Nikolić MZ, Caritg O, Jeng Q, Johnson JA, Sun D, Howell KJ, Brady JL, Laresgoiti U, Allen G, Butler R, et al. (2017). Human embryonic lung epithelial tips are multipotent progenitors that can be expanded in vitro as long-term self-renewing organoids. Elife 6.
  60. Ordovás L, Boon R, Pistoni M, Chen Y, Wolfs E, Guo W, Sambathkumar R, Bobis-Wozowicz S, Helsen N, Vanhove J, et al. (2015). Efficient recombinase-mediated cassette exchange in hPSCs to study the hepatocyte lineage reveals AAVS1 locus-mediated transgene inhibition. Stem Cell Reports 5, 918–931.
  61. Perez-Riverol Y, Csordas A, Bai J, Bernal-Llinares M, Hewapathirana S, Kundu DJ, Inuganti A, Griss J, Mayer G, Eisenacher M, et al. (2019). The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47.
  62. Pfannkuche K, Summer H, Li O, Hescheler J, and Dröge P. (2009). The high mobility group protein HMGA2: A co-regulator of chromatin structure and pluripotency in stem cells? Stem Cell Rev. Reports 5, 224–230.
  63. Phipson B, Lee S, Majewski IJ, Alexander WS, and Smyth GK (2016). Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression. Ann. Appl. Stat. 10, 946–963.
  64. Pillai S, Rizwani W, Li X, Rawal B, Nair S, Schell MJ, Bepler G, Haura E, Coppola D, and Chellappan S. (2011). ID1 Facilitates the Growth and Metastasis of Non-Small Cell Lung Cancer in Response to Nicotinic Acetylcholine Receptor and Epidermal Growth Factor Receptor Signaling. Mol. Cell. Biol. 31, 3052–3067.
  65. Poole CJ, and van Riggelen J. (2017). MYC—master regulator of the cancer epigenome and transcriptome. Genes (Basel). 8.
  66. Sachs N, Papaspyropoulos A, Zomer-van Ommen DD, Heo I, Böttinger L, Klay D, Weeber F, Huelsz-Prince G, Iakobachvili N, Amatngalim GD, et al. (2019). Long-term expanding human airway organoids for disease modeling. EMBO J. 38.
  67. Salwig I, Spitznagel B, Vazquez-Armendariz AI, Khalooghi K, Guenther S, Herold S, Szibor M, and Braun T. (2019). Bronchioalveolar stem cells are a main source for regeneration of distal lung epithelia in vivo. EMBO J. 38.
  68. Seino T, Kawasaki S, Shimokawa M, Tamagawa H, Toshimitsu K, Fujii M, Ohta Y, Matano M, Nanki K, Kawasaki K, et al. (2018). Human Pancreatic Tumor Organoids Reveal Loss of Stem Cell Niche Factor Dependence during Disease Progression. Cell Stem Cell 22, 454–467.e6.
  69. Sergushichev AA (2016). An algorithm for fast preranked gene set enrichment analysis using cumulative statistic calculation. BioRxiv 060012.
  70. Singh I, Mehta A, Contreras A, Boettger T, Carraro G, Wheeler M, Cabrera-Fuentes HA, Bellusci S, Seeger W, Braun T, et al. (2014). Hmga2 is required for canonical WNT signaling during lung development. BMC Biol. 12.
  71. Soneson C, and Robinson MD (2018). Bias, robustness and scalability in single-cell differential expression analysis. Nat. Methods 15, 255–261.
  72. Stuart T, Butler A, Hoffman P, Hafemeister C, Papalexi E, Mauck WM, Hao Y, Stoeckius M, Smibert P, and Satija R. (2019). Comprehensive Integration of Single-Cell Data. Cell 177, 1888–1902.e21.
  73. Takahashi K, and Yamanaka S. (2006). Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors. Cell 126, 663–676.
  74. Tammela T, Sanchez-Rivera FJ, Cetinbas NM, Wu K, Joshi NS, Helenius K, Park Y, Azimi R, Kerper NR, Wesselhoeft RA, et al. (2017). A Wnt-producing niche drives proliferative potential and progression in lung adenocarcinoma. Nature 545, 355–359.
  75. Thiery JP (2002). Epithelial-mesenchymal transitions in tumour progression. Nat. Rev. Cancer 2, 442–454.
  76. Tirosh I, Izar B, Prakadan SM, Wadsworth MH, Treacy D, Trombetta JJ, Rotem A, Rodman C, Lian C, Murphy G, et al. (2016). Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196.
  77. Tiyaboonchai A, Mac H, Shamsedeen R, Mills JA, Kishore S, French DL, and Gadue P. (2014). Utilization of the AAVS1 safe harbor locus for hematopoietic specific transgene expression and gene knockdown in human ES cells. Stem Cell Res. 12, 630–637.
  78. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, Van Baren MJ, Salzberg SL, Wold BJ, and Pachter L. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515.
  79. Travaglini KJ, Nabhan AN, Penland L, Sinha R, Gillich A, Sit RV, Chang S, Conley SD, Mori Y, Seita J, et al. (2019). A molecular cell atlas of the human lung from single cell RNA sequencing. BioRxiv 742320.
  80. Tuveson DA, Shaw AT, Willis NA, Silver DP, Jackson EL, Chang S, Mercer KL, Grochow R, Hock H, Crowley D, et al. (2004). Endogenous oncogenic K-rasG12D stimulates proliferation and widespread neoplastic and developmental defects. Cancer Cell 5, 375–387.
  81. Winslow MM, Dayton TL, Verhaak RGW, Kim-Kiselak C, Snyder EL, Feldser DM, Hubbard DD, DuPage MJ, Whittaker CA, Hoersch S, et al. (2011). Suppression of lung adenocarcinoma progression by Nkx2-1. Nature 473, 101–104.
  82. Wolf FA, Angerer P, and Theis FJ (2018). SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15.
  83. Xu X, Rock JR, Lu Y, Futtner C, Schwab B, Guinney J, Hogan BLM, and Onaitis MW (2012). Evidence for type II cells as cells of origin of K-Ras-induced distal lung adenocarcinoma. Proc. Natl. Acad. Sci. U. S. A. 109, 4910–4915.
  84. Yang J, and Weinberg RA (2008). Epithelial-Mesenchymal Transition: At the Crossroads of Development and Tumor Metastasis. Dev. Cell 14, 818–829.
  85. Zacharias WJ, Frank DB, Zepp JA, Morley MP, Alkhaleel FA, Kong J, Zhou S, Cantu E, and Morrisey EE (2018). Regeneration of the lung alveolus by an evolutionarily conserved epithelial progenitor. Nature 555, 251–255.
  86. Zhang H, Brainson CF, Koyama S, Redig AJ, Chen T, Li S, Gupta M, Garcia-de-Alba C, Paschini M, Herter-Sprie GS, et al. (2017a). Lkb1 inactivation drives lung cancer lineage switching governed by Polycomb Repressive Complex 2. Nat. Commun. 8, 1–14.
  87. Zhang Z, Newton K, Kummerfeld SK, Webster J, Kirkpatrick DS, Phu L, Eastham-Anderson J, Liu J, Lee WP, Wu J, et al. (2017b). Transcription factor Etv5 is essential for the maintenance of alveolar type II cells. Proc. Natl. Acad. Sci. U. S. A. 114, 3903–3908.
  88. Zhou CH, Ye LP, Ye SX, Li Y, Zhang XY, Xu XY, and Gong LY (2012). Clinical significance of SOX9 in human non-small cell lung cancer progression and overall patient survival. J. Exp. Clin. Cancer Res. 31.
  89. Zhu Z, Golay HG, and Barbie DA (2014). Targeting pathways downstream of KRAS in lung adenocarcinoma. Pharmacogenomics 15, 1507–1518.

Associated Data

Supplementary Materials

Table S1, related to figures 1, S1, 4, S4, S5, 6, and S6: TF/TCF and gene signature lists. List of TF/TCF, and genes used to create signature z-scores for KRAS, AT2 identity, Sox9 targets, RELA/NF-kappaB targets, and proliferation.

Table S2, related to figures 1 and S1: Differentially expressed genes and Gene Ontology (GO) analysis in scRNA-Seq data of LUAD GEMM. Top 1000 differentially upregulated genes in each cluster, unfiltered and filtered for TF/TCF; n = gene name, p = p-value, l = log2-fold change. Identification of unique and common GO terms between C0 and C1 in early-stage lung cancer GEMM. All common and unique pathways are statistically significant.

Table S3, related to figures 3 and S3: Bulk RNA-Seq gene lists of organoids. Differentially expressed genes in KY-CRE vs. KY-Emp (KY-Dif) and KPY-CRE vs. KPY-Emp (KPY-Dif), and list of shared up- and down-regulated genes of KY-Dif compared to KPY-Dif.

Table S4, related to figures 4 and S4: Differentially expressed genes and Gene Ontology (GO) analysis in scRNA-Seq data of KY organoids. Top 1000 differentially upregulated genes in each cluster, unfiltered and filtered for TF/TCF; n = gene name, p = p-value, l = log2-fold change. Identification of unique GO terms between all clusters. All unique pathways are statistically significant.

Table S5, related to figures 5 and S5: Differentially expressed genes in scRNA-Seq data of iAT2s. Proteomics and phosphoproteomics results of iAT2s. DEG analysis and mass spectrometry results of KRASG12D-expressing iAT2s (dox) vs. control (DMSO).

Table S6, related to figures 6 and S6: Differentially expressed genes in scRNA-Seq data of LUAD stage IA patient samples. DEG analysis comparing AT2 stage IA to AT2 normal.

Data Availability Statement

Raw and processed single-cell and bulk RNA-seq data were deposited to the NCBI Gene Expression Omnibus (GEO) and Sequence Read Archive (SRA) under the following accession codes:

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the Proteomics Identifications (PRIDE) partner repository (Deutsch et al., 2020; Perez-Riverol et al., 2019) with the dataset identifier PXD019240.

RESOURCES